
Using a list to store results of a double loop (for-loop) in R

I want to make calculations for elements of individual rows using a for-loop. I have two data.frames:

  1. df: contains data for all trading days of the stocks
  2. events: contains data for only the event days of the stocks

Even though there might be a much easier approach for this specific example, I'd like to know how to do such a task with a loop in a loop (for-loops).

First, my data.frames:

comp1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
date1 <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
ret <- c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1,2.0,1.9,-0.98,1.45,1.71,0.03)
df <- data.frame(comp1,date1,ret)
comp2 <- c(1,1,2,2,2,3,3)
date2 <- c(2,4,1,2,5,4,5)
q <- paste("")
events <- data.frame(comp2,date2,q)

df

#    comp1 date1   ret
# 1      1     1  1.20
# 2      1     2  2.20
# 3      1     3 -0.50
# 4      1     4  0.98
# 5      1     5  0.73
# 6      2     1 -1.30
# 7      2     2 -0.02
# 8      2     3  0.30
# 9      2     4  1.10
# 10     2     5  2.00
# 11     3     1  1.90
# 12     3     2 -0.98
# 13     3     3  1.45
# 14     3     4  1.71
# 15     3     5  0.03

events

#   comp2 date2 q
# 1     1     2  
# 2     1     4  
# 3     2     1  
# 4     2     2  
# 5     2     5  
# 6     3     4  
# 7     3     5  

I want to run a calculation on df$ret; as an example, let's just take 2 * df$ret. The results for each event day should be stored in mylist. The final output should be the data.frame "events", with the results of the calculation stored in its column "q".

# important objects:
companies <- as.vector(unique(df$comp1)) # all the companies (here: 1, 2, 3)
days <- as.vector(unique(df$date1)) # all the trading-days (here: 1, 2, 3, 4, 5)
mylist <- vector('list', length(companies)) # a list where the results should be stored for each company

I came up with some piece of code which doesn't work. But I still think it should look something like this:

for(i in 1:nrow(events)) {
  events_k <- events[which(comp1==companies[i]),] # data of all event days of company i
  df_k <- df[which(comp2==companies[i]),] # data of all trading days of company i

  for(j in 1:nrow(df_k)) {
    events_k[j, "q"] <- df_k[which(days==events_k[j,"date2"]), "ret"] * 2


  }
  mylist[i] <- events_k   
}

I don't understand how to set up the loop inside the other loop and how to store the results in mylist. Any help appreciated!!

Thank you!

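Don't feel bad. All of your problems are common R gotchas. First, try changing the earlier definition of events to

events <- data.frame(comp2,date2,q,stringsAsFactors=FALSE)

Before R 4.0.0, data.frame() converted character columns to factors by default, so your column q silently became a factor. Assigning the numeric results into it later then fails with an "invalid factor level" warning and stores NA. (With stringsAsFactors=FALSE, q stays a character column, so the results are stored as strings; initialize q with NA_real_ instead of paste("") if you want numbers.)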
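To see the column-type gotcha in isolation, here is a minimal sketch. The names ev_factor, ev_chr and ev_num are made up for this illustration, and stringsAsFactors = TRUE is passed explicitly only to mimic the default of R versions before 4.0.0.

```r
# Factor column: assigning a value that is not an existing level yields NA.
# stringsAsFactors = TRUE is set explicitly to mimic the pre-4.0.0 default.
ev_factor <- data.frame(comp2 = c(1, 1), q = "", stringsAsFactors = TRUE)
suppressWarnings(ev_factor[1, "q"] <- 4.4)  # warns about an invalid factor level
is.na(ev_factor$q[1])

# Character column: the number is accepted, but stored as a string.
ev_chr <- data.frame(comp2 = c(1, 1), q = "", stringsAsFactors = FALSE)
ev_chr[1, "q"] <- 4.4
class(ev_chr$q)

# Numeric placeholder (an assumption, not what the question used): stays numeric.
ev_num <- data.frame(comp2 = c(1, 1), q = NA_real_)
ev_num[1, "q"] <- 4.4
class(ev_num$q)
```

Starting from a numeric placeholder avoids having to convert q back with as.numeric() afterwards.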

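Next, let's consider the fixed loop:

for(i in seq_along(companies)) { # loop over the companies, not over the rows of events
  events_k <- events[which(events$comp2==companies[i]),] # data of all event days of company i (comp2 belongs to events)
  df_k <- df[which(df$comp1==companies[i]),] # data of all trading days of company i (comp1 belongs to df)

  for(j in seq_len(nrow(events_k))) { # loop over the event days of company i
    # twice the return on the event day, or NA if no trading day matches
    events_k[j, "q"] <-
      if (0 == length(tmp <- df_k[which(df_k$date1==events_k[j,"date2"]), "ret"] * 2)) NA
      else tmp
  }
  mylist[[i]] <- events_k
}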

Your first problem was on the last line, where you used [ instead of [[. In R, [ always returns (or assigns into) a sub-list, whereas [[ accesses the element itself, so storing a whole data.frame in one slot of the list needs [[.
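The difference between [ and [[ can be checked on a toy list. This is a small sketch; d, lst_ok, lst_bad and lst_alt are hypothetical names used only here.

```r
d <- data.frame(a = 1:2, b = 3:4)

# [[ stores the whole data.frame as a single list element.
lst_ok <- vector("list", 2)
lst_ok[[1]] <- d
is.data.frame(lst_ok[[1]])

# [ treats the data.frame as a list of columns and tries to spread them
# over a one-element slice, so only part of it ends up in the slot.
lst_bad <- vector("list", 2)
suppressWarnings(lst_bad[1] <- d)  # warns that the lengths do not match
is.data.frame(lst_bad[[1]])

# [ does work if the value is wrapped in a list first.
lst_alt <- vector("list", 2)
lst_alt[1] <- list(d)
is.data.frame(lst_alt[[1]])
```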

Your second problem is that sometimes the which(...) lookup in the inner loop returns integer(0) (i.e., there is no trading day matching the event date). The if/else stores NA in that case, so the code will run, but the results can contain NAs. To strip rows that are entirely NA, and any data frames left empty as a result, you could do something like:

mylist <- Filter(function(df) nrow(df) > 0,
  lapply(mylist, function(df) df[apply(df, 1, function(row) !all(is.na(row))), ]))

which drops rows that are all NA from each data frame and then filters out list elements whose data frames are empty.
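Putting the pieces together, the following is one possible end-to-end sketch rather than the only way to do it. It assumes q is initialized as a numeric NA instead of paste(""), combines the per-company pieces with do.call(rbind, ...) to get back a single events data.frame, and cross-checks the loop against a vectorized lookup with match(). The names result, key_df, key_ev and q_vec are introduced here for illustration.

```r
# Same values as in the question
df <- data.frame(
  comp1 = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  date1 = c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5),
  ret   = c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1,2.0,1.9,-0.98,1.45,1.71,0.03)
)
events <- data.frame(
  comp2 = c(1,1,2,2,2,3,3),
  date2 = c(2,4,1,2,5,4,5),
  q     = NA_real_   # assumption: numeric placeholder instead of paste("")
)

companies <- unique(df$comp1)
mylist <- vector("list", length(companies))

for (i in seq_along(companies)) {
  events_k <- events[events$comp2 == companies[i], ]  # event days of company i
  df_k     <- df[df$comp1 == companies[i], ]          # trading days of company i

  for (j in seq_len(nrow(events_k))) {
    # return of company i on the j-th event day (empty if no such trading day)
    hit <- df_k$ret[df_k$date1 == events_k$date2[j]]
    events_k$q[j] <- if (length(hit) == 1) 2 * hit else NA_real_
  }
  mylist[[i]] <- events_k
}

# Stack the per-company data frames back into one events table
result <- do.call(rbind, mylist)

# Vectorized cross-check: match each (company, day) pair of events against df
key_df <- paste(df$comp1, df$date1)
key_ev <- paste(events$comp2, events$date2)
q_vec  <- 2 * df$ret[match(key_ev, key_df)]

isTRUE(all.equal(result$q, q_vec))
```

The match() version is the kind of "much easier approach" the question alludes to; the double loop is kept because that is what was asked for.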


 