Using a list to store results of a double loop (for-loop) in R

Question

I want to run calculations on the elements of individual rows using a for-loop. I have two data.frames:

df: contains data for all trading days of the stocks
events: contains data for only the event days of the stocks

Even though there might be a much easier approach for this specific example, I'd like to know how to do such a task with a loop in a loop (for-loops).

First, my data.frames:

comp1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
date1 <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
ret <- c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1,2.0,1.9,-0.98,1.45,1.71,0.03)
df <- data.frame(comp1,date1,ret)
comp2 <- c(1,1,2,2,2,3,3)
date2 <- c(2,4,1,2,5,4,5)
q <- paste("")
events <- data.frame(comp2,date2,q)

df

#    comp1 date1   ret
# 1      1     1  1.20
# 2      1     2  2.20
# 3      1     3 -0.50
# 4      1     4  0.98
# 5      1     5  0.73
# 6      2     1 -1.30
# 7      2     2 -0.02
# 8      2     3  0.30
# 9      2     4  1.10
# 10     2     5  2.00
# 11     3     1  1.90
# 12     3     2 -0.98
# 13     3     3  1.45
# 14     3     4  1.71
# 15     3     5  0.03

events

#   comp2 date2 q
# 1     1     2  
# 2     1     4  
# 3     2     1  
# 4     2     2  
# 5     2     5  
# 6     3     4  
# 7     3     5

I want to run a calculation on df$ret. As an example, let's just take 2 * df$ret. The result for each event day should be stored in mylist. The final output should be the data.frame "events", with the results of the calculation stored in its column "q".

# important objects:
companies <- as.vector(unique(df$comp1)) # all the companies (here: 1, 2, 3)
days <- as.vector(unique(df$date1)) # all the trading-days (here: 1, 2, 3, 4, 5)
mylist <- vector('list', length(companies)) # a list where the results should be stored for each company

I came up with some code that doesn't work, but I still think it should look something like this:

for(i in 1:nrow(events)) {
  events_k <- events[which(comp1==companies[i]),] # data of all event days of company i
  df_k <- df[which(comp2==companies[i]),] # data of all trading days of company i

  for(j in 1:nrow(df_k)) {
    events_k[j, "q"] <- df_k[which(days==events_k[j,"date2"]), "ret"] * 2


  }
  mylist[i] <- events_k   
}

I don't understand how to set up the loop inside the other loop and how to store the results in mylist. Any help appreciated!!

Thank you!

Answer 1

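Don't feel bad. All of your problems are common R gotchas. First, try creating events with

events <- data.frame(comp2,date2,q,stringsAsFactors=FALSE)

instead of the earlier call. In R versions before 4.0.0, data.frame() converts character columns to factors by default, so your column q becomes a factor, and writing a numeric result into it later gives NA with an "invalid factor level" warning.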
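To see what the column type does to the assignment, here is a minimal sketch. The three one-row data.frames (ev_factor, ev_char, ev_numeric) are hypothetical toy objects, not the question's data, and stringsAsFactors is set explicitly so the result does not depend on the R version.

```r
# Hypothetical toy data.frames, each with a single column q.
ev_factor  <- data.frame(q = "", stringsAsFactors = TRUE)   # q is a factor
ev_char    <- data.frame(q = "", stringsAsFactors = FALSE)  # q is character
ev_numeric <- data.frame(q = NA_real_)                      # q is numeric

# Factor column: 2.4 is not an existing level, so R stores NA
# (and warns about an invalid factor level, suppressed here).
suppressWarnings(ev_factor[1, "q"] <- 2 * 1.2)

# Character column: the number is silently coerced to text.
ev_char[1, "q"] <- 2 * 1.2

# Numeric column: the number is stored as a number.
ev_numeric[1, "q"] <- 2 * 1.2

is.na(ev_factor$q)   # the factor column lost the value
class(ev_char$q)     # still "character"
class(ev_numeric$q)  # "numeric"
```

Note that with stringsAsFactors=FALSE the results end up as text in q. If you want to keep them as numbers, one option is to start from a numeric column such as NA_real_.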

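Next, let's consider the fixed loop

for (i in seq_along(companies)) {
  events_k <- events[events$comp2 == companies[i], ] # data of all event days of company i
  df_k <- df[df$comp1 == companies[i], ] # data of all trading days of company i
  events_k$q <- rep(NA_real_, nrow(events_k)) # numeric column for the results

  for (j in seq_len(nrow(events_k))) {
    # 2 * ret on the trading day that matches event day j (empty if there is none)
    tmp <- df_k[df_k$date1 == events_k[j, "date2"], "ret"] * 2
    events_k[j, "q"] <- if (length(tmp) == 0) NA else tmp
  }
  mylist[[i]] <- events_k
}

Your first problem was on the last line, where you used [ instead of [[. In R, mylist[i] always refers to a list (here a sub-list of length one), whereas mylist[[i]] refers to the element stored in it. A data.frame is itself a list of columns, so mylist[i] <- events_k keeps only its first column (with a warning), while mylist[[i]] <- events_k stores the whole data.frame.

Your second problem is the indexing. The outer loop has to run over the companies rather than over the rows of events, events has to be filtered by comp2 and df by comp1 (they were swapped), and the inner loop has to run over the rows of events_k rather than df_k. Comparing against the columns of the data.frames (events$comp2, df_k$date1) instead of the free-standing vectors comp1, comp2 and days keeps each row selection tied to the right table.

Your third problem is that sometimes no trading day matches the event date, so the lookup returns numeric(0), which cannot be assigned to a single cell. The if stores NA in that case. If a company has no events at all, its list element is an empty data.frame. To drop those, along with any rows that are entirely NA, you could do something like:

mylist <- Filter(function(df) nrow(df) > 0,
  lapply(mylist, function(df) df[apply(df, 1, function(row) !all(is.na(row))), ]))

which will filter out rows that are all NA, and then list elements with empty data.frames.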
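The question asks for a single data.frame events with a filled q column, while the loop leaves one data.frame per company in mylist. The sketch below shows one possible way to stack the pieces with do.call(rbind, ...), and, since the question mentions that an easier approach may exist, a loop-free variant for comparison. It rebuilds the question's data so it runs on its own. The names result_loop, result_vec, key_df and key_events are hypothetical, and using match() for the lookup is a swapped-in technique, not the original inner loop.

```r
# Data from the question (q is created later as a numeric column).
df <- data.frame(
  comp1 = rep(c(1, 2, 3), each = 5),
  date1 = rep(c(1, 2, 3, 4, 5), times = 3),
  ret   = c(1.2, 2.2, -0.5, 0.98, 0.73, -1.3, -0.02, 0.3, 1.1, 2.0,
            1.9, -0.98, 1.45, 1.71, 0.03)
)
events <- data.frame(comp2 = c(1, 1, 2, 2, 2, 3, 3),
                     date2 = c(2, 4, 1, 2, 5, 4, 5))

# Loop version: one data.frame per company, collected in a list.
companies <- unique(df$comp1)
mylist <- vector("list", length(companies))
for (i in seq_along(companies)) {
  events_k <- events[events$comp2 == companies[i], ]
  df_k <- df[df$comp1 == companies[i], ]
  # match() returns, for each event date, the row of df_k with that date,
  # or NA when there is no such trading day.
  events_k$q <- 2 * df_k$ret[match(events_k$date2, df_k$date1)]
  mylist[[i]] <- events_k
}

# Stack the per-company pieces back into a single data.frame.
result_loop <- do.call(rbind, mylist)

# Loop-free variant: build a combined company/date key for both tables.
key_df <- paste(df$comp1, df$date1)
key_events <- paste(events$comp2, events$date2)
result_vec <- events
result_vec$q <- 2 * df$ret[match(key_events, key_df)]

result_loop
```

Both versions should give the same q values here (4.4, 1.96, -2.6, -0.04, 4, 3.42, 0.06). The loop version is closer to what the question asks for, and the key-based version avoids the per-company subsetting.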

1 answers

solution1
2 2014-04-03 18:54:38
