Looping

Loops are used to repeat (iterate) an R statement or a set of statements. R provides three kinds of loop, for(), while(), and repeat, of which for() loops are the most commonly used:

for() # Repeat a set of statements a specified number of times
while() # Repeat a set of statements as long as a specified condition is met
repeat # Repeat a set of statements until a break command is encountered

Two other commands, break and next, are used, respectively, to terminate a loop’s iterations and to skip ahead to the next iteration:

break # Terminate a loop’s iterations
next # Skip ahead to the next iteration
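
As a minimal sketch of how these two commands behave, the loop below uses next to skip odd values and break to stop at the first even value greater than 6. The vector x, the result name kept, and the cutoff are made up purely for illustration:

```r
# Hypothetical input vector, chosen only to illustrate next and break
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
kept <- c()                 # will collect the values we keep

for (v in x) {
  if (v %% 2 == 1) next     # odd value: skip ahead to the next iteration
  if (v > 6) break          # first even value above 6: terminate the loop
  kept <- c(kept, v)
}

kept
# [1] 2 4 6
```

The value 8 never reaches kept because break ends the loop before the last statement runs.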

Here’s an example in which each of the three loop types, for(), while(), and repeat, is used to perform the same simple task, namely printing the numbers 1^2, 2^2, …, 5^2 to the screen:

for(i in 1:5) {
  print(i^2)
}
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
i <- 1
while(i <= 5) {
  print(i^2)
  i <- i + 1
}
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
i <- 1
repeat {
  print(i^2)
  i <- i + 1
  if(i > 5) break
}
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
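
The three loops above are interchangeable because the number of iterations is fixed at five. When the number of iterations isn’t known beforehand, while() (or repeat with break) is usually the more natural choice. The following sketch, with a made-up starting value and illustrative variable names, counts how many times a number must be halved before it drops below 1:

```r
# Hypothetical task: how many halvings until 100 falls below 1?
x <- 100
n_halvings <- 0

while (x >= 1) {            # the condition is re-checked before every iteration
  x <- x / 2
  n_halvings <- n_halvings + 1
}

n_halvings
# [1] 7
```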

for() Loops

for() loops are used when we know in advance how many iterations the loop should perform. The general form of a for() loop is:

for(i in sequence) {
  statement1
  statement2
  .
  .
  .
  statementq
}

where sequence is a vector or list and i (whose name you’re free to change) takes each value in sequence in turn. Each value triggers one iteration of the loop, during which statements 1 through q are executed. The statements usually involve the variable i.
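
The sequence doesn’t have to be 1:n. As a small sketch (the vector mints and the loop variable names are illustrative), we can loop either over the values of a character vector directly or over its positions using seq_along():

```r
mints <- c("Den", "SF", "Phil")   # hypothetical character vector

# Loop directly over the values: mint is "Den", then "SF", then "Phil"
for (mint in mints) {
  print(mint)
}

# Loop over the positions instead: seq_along(mints) is 1, 2, 3
for (k in seq_along(mints)) {
  print(paste(k, mints[k]))
}
```

Looping over positions is handy when the position itself is needed inside the loop, for example to store a result in the matching slot of another vector.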

Here’s an example. Suppose we have the following data frame describing someone’s coin collection:

coins <- data.frame(Coin = c("penny", "quarter", "nickel", "quarter", "dime", "penny"),
                    Year = c(1943, 1905, 1889, 1960, 1937, 1900),
                    Mint = c("Den", "SF", "Phil", "Den", "SF", "Den"),
                    Condition = c("good", "fair", "excellent", "good", "poor", "good"),
                    Value = c(12.00, 55.00, 300.00, 40.00, 18.00, 28.00),
                    Price = c(15.00, 45.00, 375.00, 25.00, 20.00, 20.00))
coins
     Coin Year Mint Condition Value Price
1   penny 1943  Den      good    12    15
2 quarter 1905   SF      fair    55    45
3  nickel 1889 Phil excellent   300   375
4 quarter 1960  Den      good    40    25
5    dime 1937   SF      poor    18    20
6   penny 1900  Den      good    28    20

If we type:

colMeans(coins)
Error in colMeans(coins): 'x' must be numeric

we get an error message because some of the columns are non-numeric. We can compute the means of the numeric columns by looping over the columns, each time checking whether the column is numeric before computing its mean:

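means <- NULL                        # start empty; grows by one value per numeric column
for(i in seq_len(ncol(coins))) {     # seq_len() gives 1, 2, ..., ncol(coins)
  if (is.numeric(coins[ , i])) {     # skip the non-numeric columns
    means <- c(means, mean(coins[ , i]))
  }
}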

The result is:

means
[1] 1922.33333   75.50000   83.33333
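
The result above carries no names, so we have to remember which mean belongs to which column. One possible variation, sketched here with a cut-down copy of coins so that it runs on its own, loops over the column names and stores each mean under its name. The name named_means is illustrative:

```r
# A cut-down copy of the coins data frame, repeated so the sketch is self-contained
coins <- data.frame(Coin  = c("penny", "quarter", "nickel", "quarter", "dime", "penny"),
                    Year  = c(1943, 1905, 1889, 1960, 1937, 1900),
                    Value = c(12.00, 55.00, 300.00, 40.00, 18.00, 28.00),
                    Price = c(15.00, 45.00, 375.00, 25.00, 20.00, 20.00))

named_means <- numeric(0)                     # empty numeric vector to fill
for (col in names(coins)) {                   # col is "Coin", "Year", "Value", "Price"
  if (is.numeric(coins[[col]])) {
    named_means[col] <- mean(coins[[col]])    # store the mean under the column name
  }
}

named_means
```

Printing named_means shows the same three values as before, now labelled Year, Value, and Price.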

Looping Over List Elements

In the next example, we loop over the elements of a list, printing each list element and recording its length during each iteration:

myList <- list(
  w = c(4, 4, 5, 5, 6, 6),
  x = c("a", "b", "c"),
  y = c(5, 10, 15),
  z = c("r", "s", "t", "u", "v")
)

lengths <- NULL

for(i in myList) {
  print(i)
  lengths <- c(lengths, length(i))
}
[1] 4 4 5 5 6 6
[1] "a" "b" "c"
[1]  5 10 15
[1] "r" "s" "t" "u" "v"
lengths
[1] 6 3 3 5
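
Growing a vector with c() inside the loop is fine for a short list like this one. A common alternative, sketched below, creates the result vector at its final size first and then fills it by position. The name n_elements is illustrative; it also avoids reusing the name lengths, which base R already uses for a function that returns the same values in a single call:

```r
myList <- list(
  w = c(4, 4, 5, 5, 6, 6),
  x = c("a", "b", "c"),
  y = c(5, 10, 15),
  z = c("r", "s", "t", "u", "v")
)

n_elements <- integer(length(myList))    # preallocate one slot per list element
names(n_elements) <- names(myList)

for (k in seq_along(myList)) {
  n_elements[k] <- length(myList[[k]])   # [[k]] extracts the k-th element itself
}

n_elements
lengths(myList)                          # base R function giving the same values
```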

These examples are simple, but loops are a powerful tool for automating repetitive analyses and data-processing steps.

In the next chapter we will look at the apply() family of functions, which are designed for applying a function to a data set in several convenient ways.