[R Course] Loops and Statements with R

Familiarize yourself with ifelse and if…else statements. Also use For and While loops to repeat a specific block of code.

Thierry Warin (https://warin.ca/aboutme.html), HEC Montréal and CIRANO (Canada) (https://www.hec.ca/en/profs/thierry.warin.html)
September 5, 2019

Introduction

In this course, we assume you’re familiar with basic data structures, arithmetic operations, and comparison operators in R. Not quite there yet? Check out our free R Basics course.

You will discover ifelse and if...else statements, as well as loops in R. You will learn to use two types of loops: For and While.

Loops

“Looping”, “cycling”, “iterating” or just replicating instructions is an old practice that originated well before the invention of computers. It is nothing more than automating a multi-step process by organizing sequences of actions or ‘batch’ processes and by grouping the parts that need to be repeated.

Statements

Defining a choice in your code is pretty simple: If this condition is true, then carry out a certain task. Many programming languages let you do that with exactly those words: if . . . then.

Data

We will work with data from UNIDO. The data is stored in a Google Sheet, and you can load it with the following code.

library(gsheet)

dataUnido <- gsheet2tbl("https://docs.google.com/spreadsheets/d/1uLaXke-KPN28-ESPPoihk8TiXVWp5xuNGHW7w7yqLCc/edit?usp=sharing")
date country GDP section
2010 australia 1142250506 a
2011 australia 1389919156 d
2012 australia 1537477830 c
2013 australia 1563950959 e
2014 australia 1454675480 e
2015 australia 1339539063 f
2010 belgium 483577483 c
2011 belgium 526975257 e
2012 belgium 497815990 a
2013 belgium 521370528 f
2014 belgium 531234804 d
2015 belgium 454039037 d
2010 canada 1613406135 c
2011 canada 1788703386 f
2012 canada 1824288757 a
2013 canada 1837443487 a
2014 canada 1783775591 b
2015 canada 1550536520 b
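
If the Google Sheet cannot be reached (for example, when working offline), the same table can be rebuilt by hand. The sketch below copies the 18 rows printed above into a base data frame. The name dataUnidoLocal is a hypothetical choice, and the result is a plain data.frame, which may differ in class from the object that gsheet2tbl returns.

```r
# Rebuild the UNIDO table shown above without a network connection.
# dataUnidoLocal is a hypothetical name, not part of the original course.
dataUnidoLocal <- data.frame(
  date    = rep(2010:2015, times = 3),
  country = rep(c("australia", "belgium", "canada"), each = 6),
  GDP     = c(1142250506, 1389919156, 1537477830, 1563950959, 1454675480, 1339539063,
              483577483, 526975257, 497815990, 521370528, 531234804, 454039037,
              1613406135, 1788703386, 1824288757, 1837443487, 1783775591, 1550536520),
  section = c("a", "d", "c", "e", "e", "f",
              "c", "e", "a", "f", "d", "d",
              "c", "f", "a", "a", "b", "b"),
  stringsAsFactors = FALSE
)

# Quick check that the shape matches the printed table
dim(dataUnidoLocal)
```

To follow the rest of the course with this copy, assign it to dataUnido.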

Ifelse Statement

Ifelse

ifelse(condition, x, y)

Here, condition must be a logical vector (or an object that can be coerced to logical). The return value is a vector with the same length as condition.

Each element of the returned vector is taken from x where the corresponding element of condition is TRUE, and from y where it is FALSE.

In other words, the test is applied element by element: position i of the result is x[i] when condition[i] is TRUE and y[i] otherwise (x and y are recycled if they are shorter than condition).

Example 1

a <- c(5, 7, 2, 9)
ifelse(a %% 2 == 0, "even", "odd")
[1] "odd"  "odd"  "even" "odd" 

In the above example, the condition is a %% 2 == 0 (the remainder of a divided by 2 is zero), which results in the vector (FALSE,FALSE,TRUE,FALSE).
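
One detail worth seeing in action is how ifelse treats missing values. The sketch below uses a hypothetical scores vector (not from the UNIDO data) to show that the result keeps the length of the condition and that an NA in the condition produces an NA in the result.

```r
# Hypothetical vector with one missing value
scores <- c(12, NA, 7, 20)

# The comparison yields TRUE, NA, FALSE, TRUE, and ifelse keeps the NA
labels <- ifelse(scores >= 10, "pass", "fail")
print(labels)
# [1] "pass" NA     "fail" "pass"

# The result always has the same length as the condition
length(labels) == length(scores)
# [1] TRUE
```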

Example 2

Let’s say we want to create a new column called Region in the data frame called dataUnido, based on a condition. The condition is: IF the column country contains the word “australia”, then the new column Region will contain the word “Pacific”; ELSE the new column Region will contain the phrase “Rest of the World”.

dataUnido$Region <- ifelse(dataUnido$country == "australia","Pacific","Rest of the World")
date country GDP section Region
2010 australia 1142250506 a Pacific
2011 australia 1389919156 d Pacific
2012 australia 1537477830 c Pacific
2013 australia 1563950959 e Pacific
2014 australia 1454675480 e Pacific
2015 australia 1339539063 f Pacific
2010 belgium 483577483 c Rest of the World
2011 belgium 526975257 e Rest of the World
2012 belgium 497815990 a Rest of the World
2013 belgium 521370528 f Rest of the World
2014 belgium 531234804 d Rest of the World
2015 belgium 454039037 d Rest of the World
2010 canada 1613406135 c Rest of the World
2011 canada 1788703386 f Rest of the World
2012 canada 1824288757 a Rest of the World
2013 canada 1837443487 a Rest of the World
2014 canada 1783775591 b Rest of the World
2015 canada 1550536520 b Rest of the World

Ifelse Ladder

Example 1

Now we want to create a new column called Continents in the data frame called dataUnido, based on multiple conditions. The conditions are: IF the column country contains the word “australia”, then the new column Continents will contain the word “Pacific”; ELSE IF the column country contains the word “belgium”, then the new column Continents will contain the word “Europe”; ELSE the new column Continents will contain the word “Americas”.

dataUnido$Continents <- ifelse(dataUnido$country == "australia","Pacific", 
                               ifelse(dataUnido$country == "belgium", "Europe", "Americas"))
date country GDP section Region Continents
2010 australia 1142250506 a Pacific Pacific
2011 australia 1389919156 d Pacific Pacific
2012 australia 1537477830 c Pacific Pacific
2013 australia 1563950959 e Pacific Pacific
2014 australia 1454675480 e Pacific Pacific
2015 australia 1339539063 f Pacific Pacific
2010 belgium 483577483 c Rest of the World Europe
2011 belgium 526975257 e Rest of the World Europe
2012 belgium 497815990 a Rest of the World Europe
2013 belgium 521370528 f Rest of the World Europe
2014 belgium 531234804 d Rest of the World Europe
2015 belgium 454039037 d Rest of the World Europe
2010 canada 1613406135 c Rest of the World Americas
2011 canada 1788703386 f Rest of the World Americas
2012 canada 1824288757 a Rest of the World Americas
2013 canada 1837443487 a Rest of the World Americas
2014 canada 1783775591 b Rest of the World Americas
2015 canada 1550536520 b Rest of the World Americas

Example 2

Let’s create a column called Code under multiple conditions. In this case, IF section == a, the column Code contains the value 1, ELSE IF section == b, the column Code contains the value 2, and so on until section == e, where the column Code contains the value 5. Any other section (here, f) gets the value 6.

dataUnido$Code <- ifelse(dataUnido$section == "a",1, 
                         ifelse(dataUnido$section == "b", 2, 
                                ifelse(dataUnido$section == "c", 3, 
                                       ifelse(dataUnido$section == "d", 4, 
                                              ifelse(dataUnido$section == "e", 5, 6)))))
date country GDP section Region Continents Code
2010 australia 1142250506 a Pacific Pacific 1
2011 australia 1389919156 d Pacific Pacific 4
2012 australia 1537477830 c Pacific Pacific 3
2013 australia 1563950959 e Pacific Pacific 5
2014 australia 1454675480 e Pacific Pacific 5
2015 australia 1339539063 f Pacific Pacific 6
2010 belgium 483577483 c Rest of the World Europe 3
2011 belgium 526975257 e Rest of the World Europe 5
2012 belgium 497815990 a Rest of the World Europe 1
2013 belgium 521370528 f Rest of the World Europe 6
2014 belgium 531234804 d Rest of the World Europe 4
2015 belgium 454039037 d Rest of the World Europe 4
2010 canada 1613406135 c Rest of the World Americas 3
2011 canada 1788703386 f Rest of the World Americas 6
2012 canada 1824288757 a Rest of the World Americas 1
2013 canada 1837443487 a Rest of the World Americas 1
2014 canada 1783775591 b Rest of the World Americas 2
2015 canada 1550536520 b Rest of the World Americas 2
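
Deeply nested ifelse calls become hard to read as the number of cases grows. As an alternative (not the approach used in this course), the same letter-to-number mapping can be sketched with base R's match(), which returns the position of each value in a lookup vector. The vector section_levels is a hypothetical helper, and the input is the first six section values from the table above.

```r
# First six values of dataUnido$section, copied from the table above
section <- c("a", "d", "c", "e", "e", "f")

# Hypothetical lookup vector: the position of each letter is its code
section_levels <- c("a", "b", "c", "d", "e", "f")

code <- match(section, section_levels)
print(code)
# [1] 1 4 3 5 5 6
```

One difference to keep in mind: the ifelse ladder assigns 6 to any value that is not a to e, whereas match() returns NA for a value that is absent from the lookup vector.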

Example 3

In this example, we create a column called Level under 3 conditions: IF the column Code contains a value between 1 and 3 (inclusive), the column Level will be equal to “Low”. ELSE IF the column Code contains a value between 4 and 5 (inclusive), the column Level will be equal to “Medium”. ELSE the column Level will be equal to “High”.

# Example with the data from UNIDO
dataUnido$Level <- ifelse(dataUnido$Code >= 1 & dataUnido$Code <= 3, "Low", 
                         ifelse(dataUnido$Code >= 4 & dataUnido$Code <= 5, "Medium", "High"))
date country GDP section Region Continents Code Level
2010 australia 1142250506 a Pacific Pacific 1 Low
2011 australia 1389919156 d Pacific Pacific 4 Medium
2012 australia 1537477830 c Pacific Pacific 3 Low
2013 australia 1563950959 e Pacific Pacific 5 Medium
2014 australia 1454675480 e Pacific Pacific 5 Medium
2015 australia 1339539063 f Pacific Pacific 6 High
2010 belgium 483577483 c Rest of the World Europe 3 Low
2011 belgium 526975257 e Rest of the World Europe 5 Medium
2012 belgium 497815990 a Rest of the World Europe 1 Low
2013 belgium 521370528 f Rest of the World Europe 6 High
2014 belgium 531234804 d Rest of the World Europe 4 Medium
2015 belgium 454039037 d Rest of the World Europe 4 Medium
2010 canada 1613406135 c Rest of the World Americas 3 Low
2011 canada 1788703386 f Rest of the World Americas 6 High
2012 canada 1824288757 a Rest of the World Americas 1 Low
2013 canada 1837443487 a Rest of the World Americas 1 Low
2014 canada 1783775591 b Rest of the World Americas 2 Low
2015 canada 1550536520 b Rest of the World Americas 2 Low
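
When the conditions are consecutive numeric ranges, as here, base R's cut() offers another way to express the same binning. This is a sketch of an alternative, not the course's method; the code vector holds the Code values of the six Australian rows above.

```r
# Code values for the six Australian rows in the table above
code <- c(1, 4, 3, 5, 5, 6)

# Intervals are (0,3], (3,5] and (5,Inf], so 3 is "Low" and 5 is "Medium"
level <- cut(code,
             breaks = c(0, 3, 5, Inf),
             labels = c("Low", "Medium", "High"))

# cut() returns a factor; convert it to match the character column from ifelse
level <- as.character(level)
print(level)
# [1] "Low"    "Medium" "Low"    "Medium" "Medium" "High"
```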

Ifelse and With

Example

With this code we create a new column called Specifics under multiple conditions, this time involving multiple columns. The with() function lets us refer to the columns by name without repeating dataUnido$. The column Specifics will contain the word “Yes” IF the column country is equal to “canada” or “belgium” AND section is equal to “a” AND date is greater than or equal to 2012. ELSE the column Specifics will contain the word “No”.

dataUnido$Specifics <- with(dataUnido, ifelse(country %in% c("canada", "belgium") & 
                                               section == "a" &
                                               date >= 2012,
                                               "Yes", "No"))
date country GDP section Region Continents Code Level Specifics
2010 australia 1142250506 a Pacific Pacific 1 Low No
2011 australia 1389919156 d Pacific Pacific 4 Medium No
2012 australia 1537477830 c Pacific Pacific 3 Low No
2013 australia 1563950959 e Pacific Pacific 5 Medium No
2014 australia 1454675480 e Pacific Pacific 5 Medium No
2015 australia 1339539063 f Pacific Pacific 6 High No
2010 belgium 483577483 c Rest of the World Europe 3 Low No
2011 belgium 526975257 e Rest of the World Europe 5 Medium No
2012 belgium 497815990 a Rest of the World Europe 1 Low Yes
2013 belgium 521370528 f Rest of the World Europe 6 High No
2014 belgium 531234804 d Rest of the World Europe 4 Medium No
2015 belgium 454039037 d Rest of the World Europe 4 Medium No
2010 canada 1613406135 c Rest of the World Americas 3 Low No
2011 canada 1788703386 f Rest of the World Americas 6 High No
2012 canada 1824288757 a Rest of the World Americas 1 Low Yes
2013 canada 1837443487 a Rest of the World Americas 1 Low Yes
2014 canada 1783775591 b Rest of the World Americas 2 Low No
2015 canada 1550536520 b Rest of the World Americas 2 Low No
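
To see what with() contributes, the sketch below applies the same condition twice to a small hypothetical data frame (four rows in the style of the UNIDO table): once inside with(), and once with explicit $ references. Both versions should give the same result; with() only saves typing.

```r
# Small hypothetical data frame in the style of dataUnido
df <- data.frame(
  date    = c(2010, 2012, 2012, 2014),
  country = c("australia", "belgium", "canada", "canada"),
  section = c("a", "a", "a", "b"),
  stringsAsFactors = FALSE
)

# Version 1: columns are looked up inside df by with()
with_version <- with(df, ifelse(country %in% c("canada", "belgium") &
                                  section == "a" &
                                  date >= 2012,
                                "Yes", "No"))

# Version 2: the same test with explicit df$ prefixes
dollar_version <- ifelse(df$country %in% c("canada", "belgium") &
                           df$section == "a" &
                           df$date >= 2012,
                         "Yes", "No")

print(with_version)
# [1] "No"  "Yes" "Yes" "No"
identical(with_version, dollar_version)
# [1] TRUE
```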

If…else Statement

if statement

if (condition){
  statement
}

If the condition is TRUE, the statement gets executed. But if it’s FALSE, nothing happens.

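Here, condition should be a single logical value, or a single number that can be coerced to logical. In R versions before 4.2.0, a longer vector produced a warning and only its first element was used; since R 4.2.0, a condition of length greater than one is an error.

For a numeric condition, zero is taken as FALSE and any other number as TRUE.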

Example

i <- 5
if(i > 0){
  print("Positive number")
}
[1] "Positive number"
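
Because if looks at a single condition, a comparison on a whole vector is usually collapsed to one TRUE or FALSE first, with all() or any(). The sketch below illustrates this with the Belgian GDP values for 2010 to 2012 from the table above; the threshold of 500000000 is an arbitrary value chosen for illustration.

```r
# Belgian GDP for 2010-2012, copied from the table above
gdp <- c(483577483, 526975257, 497815990)

# all() is TRUE only if every element passes the test
all_positive <- all(gdp > 0)
if (all_positive) {
  print("All GDP values are positive")
}

# any() is TRUE if at least one element passes the test
# (500000000 is an arbitrary threshold for illustration)
some_above <- any(gdp > 500000000)
if (some_above) {
  print("At least one GDP value is above the threshold")
}
```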

if…else statement

if (condition) {
  statement1
} else {
  statement2
}

The else part is optional and is only evaluated if condition is FALSE.

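It is important to note that else must be on the same line as the closing brace of the if statement. At the top level of a script or the console, R treats the if statement as complete once the line with the closing brace ends, so an else that starts the next line causes a syntax error.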

Example 1

i <- 2
if (i > 3){
  print("Yes")
} else {
  print("No")
}
[1] "No"
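
The placement rule for else can be checked without crashing a script by handing the code to parse() as text. In the sketch below, parses_ok is a hypothetical helper that reports whether a piece of code is syntactically valid at the top level.

```r
# Two versions of the same conditional, stored as text so this sketch itself runs
misplaced   <- "if (i > 3) {\n  print(\"Yes\")\n}\nelse {\n  print(\"No\")\n}"
well_placed <- "if (i > 3) {\n  print(\"Yes\")\n} else {\n  print(\"No\")\n}"

# Hypothetical helper: TRUE if the text parses, FALSE on a syntax error
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

print(parses_ok(misplaced))
# [1] FALSE
print(parses_ok(well_placed))
# [1] TRUE
```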

Example 2

The above conditional can also be written in a single line as follows.

if (i > 3) print("Yes") else print("No")
[1] "No"

Example 3

Because if…else returns the value of the branch that was evaluated, you can also write constructs such as the one below.

x <- -3
if(x > 0) 5 else 6
[1] 6
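
Since the whole if…else construct evaluates to a value, that value can be stored in a variable. The sketch below assigns the result to y, and also shows what happens when there is no else branch and the condition is FALSE (the names y and z are chosen for illustration).

```r
x <- -3

# The value of the evaluated branch is assigned to y
y <- if (x > 0) 5 else 6
print(y)
# [1] 6

# Without an else branch, a FALSE condition yields NULL
z <- if (x > 0) 5
is.null(z)
# [1] TRUE
```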

if…else Ladder

The if…else ladder (if…else…if) statement allows you to execute one block of code among more than 2 alternatives.

The syntax of the if…else ladder is:

if (condition1) {
  statement1
} else if (condition2) {
  statement2
} else if (condition3) {
  statement3
} else {
  statement4
}

Only one statement will get executed depending upon the conditions.

Example 1

x <- 0
if (x < 0) {
  print("Negative number")
} else if (x > 0) {
  print("Positive number")
} else {
  print("Zero")
}
[1] "Zero"
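
A ladder like this is often wrapped in a function so that it can be reused on several values. The sketch below assumes a hypothetical function name, describe_sign, and applies it to three numbers with vapply(), since if works on one value at a time.

```r
# Hypothetical helper wrapping the ladder from the example above
describe_sign <- function(x) {
  if (x < 0) {
    "Negative number"
  } else if (x > 0) {
    "Positive number"
  } else {
    "Zero"
  }
}

# Apply the function to each value in turn; each call returns one string
results <- vapply(c(-2, 0, 7), describe_sign, character(1))
print(results)
# [1] "Negative number" "Zero"            "Positive number"
```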

For Loop

Loops are used in programming to repeat a specific block of code. In this section, you will learn to create a For loop. A For loop is used to iterate over a vector in R programming.

for (variable in sequence){
  statement
}

Here, sequence is a vector and variable takes on each of its values in turn during the loop. In each iteration, statement is evaluated.

Example 1

for (i in 1:4){
  j <- i + 10
  print(j)
}
[1] 11
[1] 12
[1] 13
[1] 14

In the above example, the loop iterates 4 times because the sequence is 1:4, which means 1, 2, 3, 4. It could equally be written as c(1,2,3,4) instead of 1:4.

In each iteration, the variable (i) takes on the value of the corresponding element of the sequence (1:4).

We have used the print() function to show the result of i + 10 stored in j.
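
A common pattern is to store the result of each iteration in a vector instead of printing it. The sketch below, using the Australian GDP values for 2010 to 2012 from the table above, computes year-over-year growth in percent. The result vector growth is created at its final length before the loop, and seq_along() generates the indices.

```r
# Australian GDP for 2010-2012, copied from the table above
gdp <- c(1142250506, 1389919156, 1537477830)

# One growth figure per pair of consecutive years
growth <- numeric(length(gdp) - 1)

for (i in seq_along(growth)) {
  # Percentage change from year i to year i + 1
  growth[i] <- (gdp[i + 1] - gdp[i]) / gdp[i] * 100
}

print(round(growth, 1))
# [1] 21.7 10.6
```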

While Loop

while (condition){
  statement
}

Here, condition is evaluated and the body of the loop is entered if the result is TRUE.

The statements inside the loop are executed and the flow returns to evaluate the condition again.

This is repeated each time until condition evaluates to FALSE, in which case, the loop exits.

Example

i <- 1
while (i < 5){
  print(i)
  i <- i + 1
}
[1] 1
[1] 2
[1] 3
[1] 4

In the above example, i is initialized to 1.

Here, the condition is i < 5, which evaluates to TRUE since 1 is less than 5. So, the body of the loop is entered, and i is printed and incremented.

Incrementing i is important because it ensures that the exit condition is eventually met. Failing to do so will result in an infinite loop.

In the next iteration, the value of i is 2 and the loop continues.

This will continue until i takes the value 5. The condition 5 < 5 will give FALSE and the while loop finally exits.
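
A while loop is most useful when the number of iterations is not known in advance. The sketch below counts how many times a value must be halved before it drops below 1. The starting value of 100 and the max_steps safety cap are illustrative assumptions; the cap is one way to guard against an accidental infinite loop.

```r
value <- 100
steps <- 0
max_steps <- 1000  # hypothetical safety cap against an infinite loop

# Keep halving while the value is at least 1 and the cap is not reached
while (value >= 1 && steps < max_steps) {
  value <- value / 2
  steps <- steps + 1
}

print(steps)
# [1] 7
print(value)
# [1] 0.78125
```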


Citation

For attribution, please cite this work as

Warin (2019, Sept. 5). Thierry Warin, PhD: [R Course] Loops and Statements with R. Retrieved from https://warin.ca/posts/rcourse-loops-and-statements/

BibTeX citation

@misc{warin2019r,
  author = {Warin, Thierry},
  title = {Thierry Warin, PhD: [R Course] Loops and Statements with R},
  url = {https://warin.ca/posts/rcourse-loops-and-statements/},
  year = {2019}
}