Chapter 6 Introduction to RStudio

In this chapter we will show you how to use and navigate the RStudio interface. You will learn about the different tools and functions contained within the interface, and how to set up a project directory to easily manage and organize your workflow.

6.1 RStudio Interface

When you open RStudio, it automatically launches R, so you do not need to open the R GUI installed in the previous chapter. Each time you open RStudio, a new session is started. A session represents everything you did from opening to exiting RStudio.

You are now looking at the working window, or the RStudio interface.

INCLUDE A PICTURE OF THE INTERFACE HERE

The default interface, or working window, can be broken down into 3 main areas:

  1. Left Side: The entire left pane of the working window includes the Console, Terminal, and Background Jobs tabs.
  2. Top-Right: The top right pane of the working window is made up of the Environment, History, Connections, and Tutorial tabs.
  3. Bottom-Right: The bottom right pane of the working window is where you will find the majority of the resources available within RStudio. This includes the Files, Plots, Packages, Help, Viewer, and Presentation tabs.

Some tabs are used more than others. We will go over the more frequently used tabs and show you some of the ways you can find help or solve issues on your own directly within RStudio.

6.1.1 Console

The Console includes some information about the version of R that is being used, and lists commands that you can run to learn more about the software or see quick demos.

Below this description you will see the prompt, >. This is where you type your code, execute it by pressing Enter, and see the result.

Although you can type and run all of your code directly in the console, that code is not saved. This means that if you exit RStudio and re-open it, the only way to reproduce what you did before is to remember exactly what you typed.

To avoid this, we will teach you about scripts in a later section of this chapter.

INCLUDE A PICTURE HERE

6.1.2 Environment

When we code we create different objects, such as Data or Values. Once they are created, they are stored in your environment, where you can refer back to them or use them again.

It is helpful to think of the environment as short-term memory. Although objects are stored and can be reused, once the RStudio session is finished and a new one is started, the environment resets (unless you choose to save your workspace when you exit).

When you are working, it is easy to become overwhelmed by the number of objects stored in your environment. On the right side of the environment toolbar, you can change the way objects are displayed or organized. With the List option, your objects are organized alphabetically within the categories mentioned above. The Grid option lists all objects alphabetically and provides the type, length, and size of each object.
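As a small sketch of how objects enter and leave the environment, the lines below create one value and one data object, list what is stored, and then remove one of them. The object names are made up for illustration only.

```r
# Create a value and a data object (the names are illustrative only)
sample_size <- 25
site_counts <- data.frame(site = c("upper", "lower"), count = c(12, 7))

# List the names of everything currently stored in the environment
ls()

# Remove one object; it also disappears from the Environment tab
rm(sample_size)

# Check whether an object with this name still exists
exists("sample_size")
```

If you run these lines one at a time in the console, you can watch the Environment tab update after each step.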

INCLUDE A PICTURE HERE

6.1.3 Files & Plots

The files tab in the Bottom-Right pane gives you direct access to the files found on your computer. You can create, delete, move, copy, or rename files and folders using the toolbar found along the top of the files tab. It is a direct reflection of the files on your computer, meaning anything done in this tab will remain once the session is terminated.

Checking this tab is a great way to make sure things are saved in the correct folder, or to find out where they are.

The plots tab is where you will find all of the plots created in the current session. Each time a new plot is created, it is displayed in place of the previous one. You can easily navigate back and forth through previous plots, or export plots, using the options found in the toolbar along the top of the plots tab.

6.1.4 Help & Tutorial

The help tab is one of the most useful features in RStudio. Every function (more on these later) you have access to is accompanied by a help page that explains what the function does, how to use it, what it returns, and gives examples.

If you know the name of the function you are trying to use, simply type it into the search bar found on the right side of the toolbar along the top of the help tab. Once you find it, you can read through each section and hopefully find answers to any problems you are having.
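Help pages can also be opened from the console rather than the search bar. The sketch below uses the base R function mean purely as an example; any function name you have access to can be used in its place.

```r
# Open the help page for a function by putting ? in front of its name
?mean

# The same thing, written as a function call
help("mean")

# Run the examples listed at the bottom of the help page
example(mean)
```

In RStudio, the first two lines display the page in the help tab, while example() runs the page's example code in the console.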

INCLUDE A PICTURE HERE

The tutorial tab is similar to the help tab. Instead of explaining how to use specific functions, it provides you with tutorials for different tasks you can do using R. It gives you step-by-step instructions and often includes interactive components.

INCLUDE PICTURE HERE

6.2 Projects & Directories

Now that you have a general familiarity with RStudio’s working window, we can begin to bring in concepts introduced in the Files & Folders chapter.

Because R is a coding language, it is particular and case-sensitive. This is why understanding how to name and locate your files and folders is important, and why we suggest storing all of your research project files in one folder.

6.2.1 Working Directory

In the Files & Folders chapter we learned that a directory is a location on your computer. A working directory is the current location where you are working, or the starting point when looking for files to import or export in RStudio.

You can determine what your working directory is by executing the following code in your console:

getwd()

You can also see your working directory by looking at the toolbar of the Console tab. It will be beside the R logo and version number!

INCLUDE A PICTURE HERE

It is important to remember that each nested folder represents a different directory. When you give RStudio a command to import or export a file, the location of anything outside of the working directory needs to be specified.

To help you know if something is outside of your current working directory, go to the Files tab and navigate to the location of your working directory. If you do not see the file you are looking for, or have just created, you will need to include pieces of the file path to access it through RStudio!
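The same check can be sketched in code. The example below builds a throwaway folder layout in a temporary location (the folder and file names are hypothetical) and shows that a file inside a nested folder is only found when the folder is included in the path.

```r
# Hypothetical layout: a project folder that contains a "data" sub-folder
project_dir <- file.path(tempdir(), "research_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("id,value", "1,10"), file.path(project_dir, "data", "example.csv"))

# Temporarily make the project folder the working directory
old_wd <- setwd(project_dir)

# What the Files tab would show at this location
list.files()

# The file is in a nested folder, so the bare file name is not enough
found_directly <- file.exists("example.csv")
found_with_folder <- file.exists("data/example.csv")
found_directly
found_with_folder

# Return to the original working directory
setwd(old_wd)
```

Here the nested folder name is the "piece of the file path" that has to be added before R can locate the file.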

INCLUDE A PICTURE HERE

6.2.2 Projects

A project in RStudio essentially creates a working directory that you can easily switch in and out of.

The difference is that a working directory set by hand relies on an absolute file path, meaning the specific location on your computer. This makes it hard to navigate code and files if you are not the original author, as you would need to make sure every instance where a file is mentioned is changed to match your own working directory. A project eliminates the need for absolute file paths by making them relative. This means that when working in a project, the working directory is automatically set to the location where the project file is saved.

I HAVE AN IDEA FOR AN IMAGE HERE - KINDA LIKE A PUZZLE

At this point you should have already created a folder for your research project. This folder contains all of the files associated with this project. To create an RStudio project within this folder, follow these steps:

  1. Select the File menu found in the top left of your screen next to the Edit, Code, View, etc. menus.
  2. Select New Project
  3. Select the Existing Directory option
  4. Click the Browse button and find the folder you have designated for your research project
  5. Select the folder, then click the Choose Folder button.
  6. Double check that the last location provided in the file path is the name of the folder for your research project
  7. Click Create Project

INCLUDE PICTURES FOR THIS

6.2.3 Project Settings

HERE IS WHERE WE WILL INCLUDE INFORMATION ON HOW TO CHANGE THE DEFAULT SETTING IN A R PROJECT - LIKE WORKSPACE IMAGES AND HISTORY

6.2.4 Using Projects

Now that you have created your project, it is important to learn how to ensure you are always working inside it. There are multiple ways to do this.

  1. Any time you want to work on your research in RStudio, locate your research project folder on your computer and double click the project file to open it (it is the file with the .Rproj extension). This will open RStudio.
  2. When RStudio is open, on the right side above the environment tab you will see a light blue icon. This tells you what project you are currently in. If you see Project: None you are not working in your project.

It is best practice to always make these two steps the first things you do when using RStudio for your research. It is the easiest way to eliminate many common errors seen with working directories.
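A quick console check can back up the two steps above. This is only a sketch: it assumes the project file sits directly in the working directory, which is where RStudio places it when a project is created in an existing folder.

```r
# Look for files ending in .Rproj in the current working directory
rproj_files <- list.files(pattern = "\\.Rproj$")

if (length(rproj_files) > 0) {
  message("Working in project folder: ", getwd())
} else {
  message("No .Rproj file found in: ", getwd())
}
```

If the second message appears, open the project file first and then try again.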

6.3 Scripts

As mentioned previously, the environment resets every time a new session is started. Trying to remember your code, and how to recreate your environment, can be hard. We can avoid this by using the script editor.

To access the script editor, click the File menu, select New File, then select R Script. The interface will change to include four panes. When you save your script, you will notice that it uses the .R file extension.

INCLUDE PICTURE HERE

Your script is where you save and actively work on code, while the console is where the code is executed. RStudio automatically remembers which files were open in the script editor and re-opens them when a new session begins!

When you are creating or working with scripts, it is best to stay on top of your organization to avoid getting confused about what each script does, or which project the script belongs to.

There are plenty of “best practices” you can follow to avoid this heartache and help keep your coding organized. They also help when you come back to a script you wrote a long time ago and need to add things to it!

6.3.1 Comments

One of the most important things you can do is to comment your code. Think of comments as notes, where you can explain what you’re doing without the text being treated as code. To create a comment, put the # symbol in front of what you want to type.

# This is a comment
# You can put anything in a comment
# 2 + 2

# the next line is considered code and will be executed
2 + 2
## [1] 4

6.3.2 Script Headers

Having a header at the top of your script file is a great way to help you keep your work organized. In your header you should include a title, name, project, date, etc. Basically, include enough information that you can get a general idea of what the script is for without having to go through each line of code!

Here is an example that includes the script title, project (course), author, and date!

####################
## Script Set-Up
## Advanced Statistical Modelling
## Paige Levangie
## 08 January 2024
####################

6.3.3 Libraries and Sections

The next thing you will need to do is begin your section set-up. Splitting your code into sections helps maintain your workflow so that when you run your code, everything runs in the proper order!

Creating a code section is quite easy. The hardest part is picking a title! After your script header, your very first code section should be where you load all of the libraries or packages that you need in your code.

Here is an example:

#### Load Libraries ####

# Load Libraries
library(tidyverse)
library(ggeffects)

Your next code section is where you want to read in your data. You only want to read in the data here. Any wrangling should be done in another section! Note: the file is not available to you. This is just provided as an example.

#### Import Data ####

# Import research data
mydata <- read_csv("research_data.csv")

6.3.4 File Validation and Structure

You always want to make sure your variables in your data make sense.

Here is how to do that!

# Data structure
# Make sure all variables 'make sense'
str(mydata)

# Summary
summary(mydata)

You can also tabulate some variables to make sure they make sense.

# Use tables to look at data
table(mydata$common_name, useNA = "always")
table(mydata$river, useNA = "always")
table(mydata$site, useNA = "always")

# Look at one variable
mydata$site

# Use tables to look at conditional data
table(mydata$site, mydata$common_name)
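Because the research data file is not available, the same checks can be tried on a dataset that ships with R. The sketch below uses the built-in mtcars data frame as a stand-in for mydata; its columns (cyl, gear) have nothing to do with the river data and are used only to show the pattern.

```r
# Structure and summary of a built-in dataset (stand-in for mydata)
str(mtcars)
summary(mtcars)

# One-way table, including a count of missing values
table(mtcars$cyl, useNA = "always")

# Two-way (conditional) table: number of cylinders by number of gears
cyl_by_gear <- table(mtcars$cyl, mtcars$gear)
cyl_by_gear
```

The two-way table has one row per value of the first variable and one column per value of the second, which makes unexpected combinations easy to spot.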

6.3.5 Additional Sections

From this point on, anything else you do should be contained in its own section.

When you wrangle data, you should always make sure you are not modifying the original data you read in, so you can always refer back to it if you need to. You can create different data objects for the different subsets or things you will be doing!

Some examples of additional sections are (but aren’t limited to): Data Wrangling, Visualization, Modelling, and Maps.
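As a rough sketch of what an additional section might look like, the lines below keep an imported object untouched and do the wrangling on a new object. The small data frame is made up for illustration and stands in for data that would normally come from the Import Data section.

```r
#### Import Data ####

# Stand-in for imported data (values are made up for illustration)
raw_data <- data.frame(
  site = c("upper", "lower", "upper"),
  length_mm = c(120, NA, 95)
)

#### Data Wrangling ####

# Work on a new object so raw_data stays exactly as it was read in
clean_data <- raw_data[!is.na(raw_data$length_mm), ]

# Compare the number of rows before and after removing missing values
nrow(raw_data)
nrow(clean_data)
```

Keeping raw_data unchanged means you can always compare a wrangled object against what was originally imported.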

6.4 Data Import & Export

6.4.1 Importing

When importing data into a software tool, most of the time you need to include the file path when referring to the file you want to import. This is because software tools need to know exactly where to look to find a file. They need to be directed to the location of the file.

To import a file into R, you need to use an import function. You also need to tell R exactly where to look in order to find the file.

Note: Import functions are specific to the file type you are importing

Here’s an example:

# Read in the river monitoring dataset
mydata <- read_csv("dataset_clean.csv")
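To illustrate the note that import functions are specific to the file type, here is a sketch that uses base R functions rather than read_csv, so it runs without any extra packages. It writes the same made-up data to a comma-separated file and a tab-separated file in a temporary location, then reads each one back with the function that matches its type.

```r
# Made-up data for illustration
example_data <- data.frame(site = c("upper", "lower"), count = c(12L, 7L))

# Write the same data in two different file types (temporary files)
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".txt")
write.csv(example_data, csv_path, row.names = FALSE)
write.table(example_data, tsv_path, sep = "\t", row.names = FALSE)

# Each file type is read with its matching import function
from_csv <- read.csv(csv_path)     # comma-separated values
from_tsv <- read.delim(tsv_path)   # tab-separated values

# Both imports should give the same data frame
all.equal(from_csv, from_tsv)
```

Using the wrong function for a file type usually still runs, but the columns come out wrong, which is another reason to check the structure of your data straight after importing.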

Note: What did you notice about our best practices for variable and file naming? What about scripting and comments?

You can also use the import option directly within RStudio! Try to remember how we did this in class!

6.4.2 Relative folders/removal of ‘extra’ info

When importing, it helps to think of file paths as driving directions. If Google Maps told you the order of left and right turns to get somewhere, but not the names of the streets, what good would that be to you? R needs the same kind of detail. You can’t just name the file you want to import without telling R where to look for it. If you do, R stops with an error saying that the file does not exist. Without a project, you would need to spell out the whole path:

mydata <- read_csv("/Users/John/Desktop/Stats/Research Project/research_data.csv")

When you use Projects, R already assumes you want to start looking in the project folder, so you can remove the part of the file path that tells R how to get there!

If you have an R Project within your Research Project folder, this is all you would need to include:

mydata <- read_csv("research_data.csv")

See the difference?
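To see how R fills in the missing part of the directions, the sketch below joins the working directory and a relative path into the full location R would look in. The file name is the hypothetical one used above, so the final check only reports TRUE if such a file happens to exist in your working directory.

```r
# The relative path used in the import call (the file itself is hypothetical)
relative_path <- "research_data.csv"

# The full location R would look in: working directory + relative path
full_path <- file.path(getwd(), relative_path)
full_path

# Check whether the file is there before trying to import it
file.exists(relative_path)
```

Inside a project, getwd() is the project folder, which is why the short path is enough.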

6.4.3 Exporting

When you are finished in R or want to save your tables or data frames as a spreadsheet, you export them using the write_csv function.

# Export as csv
write_csv(mydata, file = "research_data_final.csv")

Does this file name follow the naming conventions we talked about? Would you change it?

Another format is the .rda file, R’s own file format for storing data. To export your dataset as an .rda file and import it again, do this:

# Export data
save(mydata, file = "research_data.rda")

# Import data
load("research_data.rda")

Note: When you load an .rda file in R, each object in it is restored under the name it had when it was saved, so here the data frame is available again as mydata!
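A short round trip makes the naming behaviour of load concrete. This sketch saves a made-up data frame to a temporary .rda file, removes it from the environment, and loads it back. The object name and values are for illustration only.

```r
# Made-up data frame for illustration
mydata_example <- data.frame(site = c("upper", "lower"), count = c(12, 7))

# Export to a temporary .rda file
rda_path <- tempfile(fileext = ".rda")
save(mydata_example, file = rda_path)

# Remove the object so the environment no longer contains it
rm(mydata_example)

# Import: load() restores the object and returns the names it restored
loaded_names <- load(rda_path)
loaded_names

# The data frame is back under its original name
mydata_example
```

Because load restores objects under their saved names, it can silently overwrite an object of the same name that is already in your environment.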

6.5 Chapter Wrap-Up

At this point, you should have a basic understanding of how to navigate the RStudio interface, create a project and scripts, and import your data. These skills represent your foundation in R, and mastering these skills will make the majority of the work you do in R easier.

The remaining chapters will now cover different areas of data cleaning, wrangling, visualizations, and analysis using R.

6.5.1 Chapter Terms & Definitions

Here is a summary of some of the bolded terms used throughout this chapter. Refer back to this list whenever you need a refresher!