Start out by installing R and then RStudio [1].
Learn things that last longer - pick your battles - Learn the fundamentals [2].
In this workshop, we aim to move from gaining new knowledge to understanding the foundations of R [3].
R is a powerful programming language for data analysis, statistics, visualisation and more. RStudio is the program that sits between you and the R language. Both R and RStudio are freely available and have a huge community of users and developers [4].
At the end of this session you will be able to:
Our analysis should be located in a findable and accessible location. Getting used to a reusable project structure is good practice for our project data management.
Please create a folder called RProjects under the Documents folder
There are four main panels in RStudio. We will work with them soon, but first here is a short introduction.
Create two folders in your project
In RStudio, go to the fourth panel, click Files, then New Folder.
When in doubt about naming conventions, check [6].
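Folders can also be created from the console instead of the Files tab. The lines below are a minimal sketch, assuming the working directory is your project folder; only the data folder, which is used later in this handout, is shown.

```r
# Create a "data" folder inside the current working directory.
# showWarnings = FALSE keeps R quiet if the folder already exists.
dir.create("data", showWarnings = FALSE)

# Check that the folder is there
dir.exists("data")
## [1] TRUE
```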
Comments start with a hash # followed by a single space
# Description:
# Author:
# Date:
To add a section
# Starting with calculations --------------------------
From now on, I recommend that you add a new section for each exercise and a comment on every line.
Tip: For readable code, use spaces around all symbols and after commas.
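To illustrate the tip, here is the same calculation written twice. The object name area and the numbers are only illustrative and are not part of the workshop material.

```r
# Harder to read: no spaces
area<-3.14159*2*2

# Easier to read: spaces around symbols and after commas
area <- 3.14159 * 2 * 2
round(area, digits = 2)
## [1] 12.57
```

Both versions give the same result; only the second is comfortable to read.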
To get the hang of R, we start using it as a calculator. Type 2 + 2 directly into the console panel and press enter. You should see this:
2 + 2
## [1] 4
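A few more calculations you could try in the console. This is a short sketch of the basic arithmetic operators; the numbers are arbitrary.

```r
10 - 3         # subtraction
## [1] 7
4 * 5          # multiplication
## [1] 20
9 / 2          # division
## [1] 4.5
2 ^ 3          # power
## [1] 8
(2 + 2) * 10   # parentheses control the order of operations
## [1] 40
```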
R can calculate, but we would also like to save these results. We can store one or multiple values in objects to access them later.
When in doubt about naming conventions and style, check [8].
Let’s create a few objects together
# Creating a few objects --------------------------
# text should be inside double quotes
today <- "Monday"
# numbers can be small, long or with decimals
howManyPeople <- 21
# Sometimes we need to save yes or no answers,
# write TRUE or FALSE in upper case
myAnswer <- TRUE
Now, stop for a sec, have another look at the style guide [9], and discuss it with your neighbour. If you are keen and there is time, feel free to change the values of the objects we just created.
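Changing the value of an object works the same way as creating it. A small sketch, reusing two of the objects from above with new, arbitrary values:

```r
today <- "Monday"
howManyPeople <- 21

# Assigning again replaces the old value
today <- "Tuesday"

# An object can also be updated from its own current value
howManyPeople <- howManyPeople + 1

# Typing the name of an object prints its value
today
## [1] "Tuesday"
howManyPeople
## [1] 22
```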
To use functions we first need to learn how they work.
There are three ways to find help using RStudio [10]
From now on, I will encourage you to use the help for any new function you encounter.
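As a sketch of asking for help from the console, the lines below use ls as the example function. These are standard base R calls and are not necessarily the same three ways presented in the workshop.

```r
# Open the documentation for ls in the Help panel
?ls

# The same request written as a function call
help("ls")

# Show the arguments a function accepts without leaving the console
args(ls)
```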
What does ls stand for? Try ls() in your script.
We have created these three objects with specific R data types
Use class to check the data type of each object (2 min).
We use the function c() to combine values and create vectors
track <- c(10, 2, 5.3, 6, -25, 14) # numeric vector
track
## [1] 10.0 2.0 5.3 6.0 -25.0 14.0
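Once a vector exists, you can ask questions about it. A minimal sketch using the track vector from above; the functions shown are base R.

```r
track <- c(10, 2, 5.3, 6, -25, 14) # numeric vector

length(track)   # how many values does it hold?
## [1] 6
class(track)    # what type of values?
## [1] "numeric"
track[3]        # the third value
## [1] 5.3
max(track)      # the largest value
## [1] 14
```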
You can also create a vector of characters or a vector of logicals:
For a character vector, use "" for each value.
For a logical vector, use TRUE and FALSE as values.
This is how the results should look:
## [1] "one" "two" "three"
## [1] TRUE TRUE FALSE
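One possible way to produce the two results above. The object names myWords and myChecks are made up for this sketch; any valid names would do.

```r
# Character vector: every value goes inside double quotes
myWords <- c("one", "two", "three")
myWords
class(myWords)
## [1] "character"

# Logical vector: TRUE and FALSE in upper case, without quotes
myChecks <- c(TRUE, TRUE, FALSE)
myChecks
class(myChecks)
## [1] "logical"
```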
Use the help to find out more about
factor
list
Can you find an example of your own data where you can use one of these structures?
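To go with the help pages, here is a small sketch of both structures. All names and values (size, sampleInfo and their contents) are invented for illustration.

```r
# A factor stores categories; the distinct categories are called levels
size <- factor(c("small", "large", "small", "medium"))
nlevels(size)
## [1] 3

# A list can hold values of different types under one name
sampleInfo <- list(name = "sample A", replicates = 3, passed = TRUE)

# Use $ to get one element out of a list
sampleInfo$replicates
## [1] 3
```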
Let’s introduce some data to R.
First, make sure you have a data folder!
Remember R is case sensitive
download.file(url = "http://tiny.cc/csvexample",
              destfile = "data/example.csv")
mydata <- read.csv(file = "data/example.csv")
Why is str useful?
# let's now create a plot
plot(x = mydata$M_At1, y = mydata$M_At2)
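If the download is not available, the same steps can be practised on a small stand-in. This sketch assumes nothing about the real example.csv beyond the two column names used above; the object names toyData, toyFile and toyRead and all values are invented.

```r
# Hypothetical stand-in values; the real example.csv has its own contents
toyData <- data.frame(M_At1 = c(1.2, 2.5, 3.1, 4.8),
                      M_At2 = c(1.0, 2.9, 3.3, 5.1))

# Write the toy data to a temporary csv file and read it back
toyFile <- tempfile(fileext = ".csv")
write.csv(toyData, file = toyFile, row.names = FALSE)
toyRead <- read.csv(file = toyFile)

# str() shows the number of rows and columns and the type of each column
str(toyRead)

# The same kind of scatter plot, this time with axis labels and a title
plot(x = toyRead$M_At1, y = toyRead$M_At2,
     xlab = "M_At1", ylab = "M_At2", main = "Toy data")
```

Running str() right after reading a file is a quick way to confirm that the columns arrived with the names and types you expected.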
Most R packages can be installed like this: install.packages("packageName")
After installing, you need to load it using library(packageName). You will need to load a package in each new R session.
Then, go to the fourth panel and select the Packages tab; after loading a package, it should be checked.
You can also check sessionInfo()
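The checks in the Packages tab can also be done from the console. A minimal sketch using base R functions, with ggplot2 as the example package; the object name isInstalled is made up.

```r
# TRUE if ggplot2 is installed on this computer, FALSE otherwise
isInstalled <- requireNamespace("ggplot2", quietly = TRUE)
isInstalled

# Packages attached in this session appear here as "package:name"
search()

# R version, operating system and loaded packages
sessionInfo()
```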
library(ggplot2)
ggplot(data = mydata,
       mapping = aes(x = M_At1, y = M_At2)) +
  geom_point()
Now look at your script and see how well you are doing. You can keep going.
There are plenty of R resources; these are only a few.
To finish up, please send your anonymous feedback through this link before leaving: http://tiny.cc/elixir_feedback
Go to File, then Close Project (save your data if you want); then you can close RStudio.
This handout was written in R Markdown and uses the open-source Tufte style. It has been published on GitHub Pages and also as a PDF handout.
All the information about my courses can be found in my GitHub repo R for Data Analysis. These resources are freely available under the Creative Commons Attribution licence CC BY 4.0. You may re-use and adapt the material in any way you wish, without asking permission, provided you cite the original source, that is, a link back to the website R for Data Analysis and my ORCID 0000-0002-8990-1985.
I acknowledge that this publication results from the support of Elixir-Belgium for my role as data science and bioinformatics trainer.
Last update: 2018-02-14
See the installation instructions in installation.md
“Learning to code is a never ending journey with a set of challenges and delights unique to each person”
FYI: Projects make managing multiple directories straightforward
The .R extension is important for R to recognise your script
The help panel will show you the Documentation with examples at the end
You can either read the example.csv file or copy another csv file to your data folder
You can also read other kinds of files using read.table or use functions from packages like readr