I want to perform calculations on elements of individual rows using a for-loop. I have two data.frames.
Even though there might be a much easier approach for this specific example, I'd like to know how to do such a task with nested for-loops (a loop inside a loop).
First, my data.frames:
comp1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
date1 <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
ret <- c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1,2.0,1.9,-0.98,1.45,1.71,0.03)
df <- data.frame(comp1,date1,ret)
comp2 <- c(1,1,2,2,2,3,3)
date2 <- c(2,4,1,2,5,4,5)
q <- paste("")
events <- data.frame(comp2,date2,q)
df
# comp1 date1 ret
# 1 1 1 1.20
# 2 1 2 2.20
# 3 1 3 -0.50
# 4 1 4 0.98
# 5 1 5 0.73
# 6 2 1 -1.30
# 7 2 2 -0.02
# 8 2 3 0.30
# 9 2 4 1.10
# 10 2 5 2.00
# 11 3 1 1.90
# 12 3 2 -0.98
# 13 3 3 1.45
# 14 3 4 1.71
# 15 3 5 0.03
events
# comp2 date2 q
# 1 1 2
# 2 1 4
# 3 2 1
# 4 2 2
# 5 2 5
# 6 3 4
# 7 3 5
I want to perform calculations on df$ret; as an example, let's just take 2 * df$ret. The results for each event day should be stored in mylist. The final output should be the data.frame "events", with the results of the calculation stored in its column "q".
# important objects:
companies <- as.vector(unique(df$comp1)) # all the companies (here: 1, 2, 3)
days <- as.vector(unique(df$date1)) # all the trading-days (here: 1, 2, 3, 4, 5)
mylist <- vector('list', length(companies)) # a list where the results should be stored for each company
I came up with a piece of code that doesn't work, but I still think it should look something like this:
for(i in 1:nrow(events)) {
  events_k <- events[which(comp1==companies[i]),] # data of all event days of company i
  df_k <- df[which(comp2==companies[i]),] # data of all trading days of company i
  for(j in 1:nrow(df_k)) {
    events_k[j, "q"] <- df_k[which(days==events_k[j,"date2"]), "ret"] * 2
  }
  mylist[i] <- events_k
}
I don't understand how to set up the loop inside the other loop and how to store the results in mylist. Any help appreciated!!
Thank you!
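Don't feel bad. All of your problems are common R gotchas. First, try creating events with
events <- data.frame(comp2,date2,q,stringsAsFactors=FALSE)
instead. In R versions before 4.0.0, where stringsAsFactors defaults to TRUE, your column q is implicitly converted to a factor, so assigning the result of the * 2 calculation to it later fails with an invalid factor level warning and stores NA.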
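To see the factor gotcha in isolation, here is a minimal sketch. The two data.frames `as_factor` and `as_char` and their `id` column are hypothetical, and `stringsAsFactors = TRUE` is passed explicitly so the old default is reproduced on any R version.

```r
# Hypothetical toy column, with factors forced on to mimic pre-4.0.0 defaults.
q <- paste("")
as_factor <- data.frame(id = 1:2, q, stringsAsFactors = TRUE)
as_char   <- data.frame(id = 1:2, q, stringsAsFactors = FALSE)

class(as_factor$q)  # "factor"
class(as_char$q)    # "character"

# 2.4 is not an existing factor level, so R stores NA and emits a warning.
as_factor[1, "q"] <- 2.4
is.na(as_factor$q[1])

# The character column accepts the value, coerced to the string "2.4".
as_char[1, "q"] <- 2.4
as_char$q[1]
```

Note that with a character column the stored results are strings; if numbers are wanted, initialising q with NA_real_ instead of paste("") is one option.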
Next, let's consider the fixed loop:
for(i in 1:nrow(events)) {
  events_k <- events[which(comp1==companies[i]),] # data of all event days of company i
  df_k <- df[which(comp2==companies[i]),] # data of all trading days of company i
  for(j in 1:nrow(df_k)) {
    events_k[j, "q"] <-
      if (0 == length(tmp <- df_k[which(days==events_k[j,"date2"]), "ret"] * 2)) NA
      else tmp
  }
  mylist[[i]] <- events_k
}
Your first problem was on the last line, where you used [ instead of [[ (in R, [ always works with a sub-list, so the result stays wrapped in a list, whereas [[ accesses the element stored in the list itself).
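The difference between the two operators can be checked on a small list. This is only a sketch; `results` and `piece` are hypothetical stand-ins for mylist and events_k.

```r
# Hypothetical two-slot list standing in for mylist.
results <- vector("list", 2)
piece <- data.frame(comp2 = c(1, 1), date2 = c(2, 4))

# [[ stores the whole data.frame as a single list element.
results[[1]] <- piece
class(results[[1]])   # "data.frame"

# [ treats the right-hand side as a list of replacement elements. A data.frame
# is a list of columns, so only its first column is stored (with a warning).
results[2] <- piece
class(results[[2]])   # "numeric"

# When extracting, [ always returns a list, even for a single element.
class(results[1])     # "list"
```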
Your second problem is that sometimes which(days==events_k[j,"date2"]) is integer(0) (i.e., there is no matching event date), so the expression has length zero and there is nothing to assign; the if/else above substitutes NA in that case. The code will then work, but you'll still have a lot of data.frames with NAs. To remove those, you could do something like:
mylist <- Filter(function(df) nrow(df) > 0,
lapply(mylist, function(df) df[apply(df, 1, function(row) !all(is.na(row))), ]))
which will filter out list elements with empty data.frames, and rows in data.frames with all NA.
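One further thing worth checking: the loop as posted subsets events with comp1 and df with comp2, and lets i run over the rows of events rather than over the companies. The following is a sketch of a nested loop with those swapped, assuming the intent is one pass per company and one inner pass per event day. Starting q as a numeric NA column, looking up the day with match(), and the name `result` are choices made here, not part of the original code.

```r
# Data from the question; q starts as a numeric NA column (an assumption made
# here so that results are stored as numbers rather than strings).
df <- data.frame(
  comp1 = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  date1 = c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5),
  ret   = c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1,2.0,1.9,-0.98,1.45,1.71,0.03)
)
events <- data.frame(
  comp2 = c(1,1,2,2,2,3,3),
  date2 = c(2,4,1,2,5,4,5),
  q     = NA_real_
)

companies <- unique(df$comp1)
mylist <- vector("list", length(companies))

for (i in seq_along(companies)) {
  # Subset each data.frame by its own company column.
  events_k <- events[events$comp2 == companies[i], ]
  df_k     <- df[df$comp1 == companies[i], ]

  for (j in seq_len(nrow(events_k))) {
    # Row of df_k whose trading day equals the event day (NA if there is none).
    hit <- match(events_k$date2[j], df_k$date1)
    events_k$q[j] <- if (is.na(hit)) NA_real_ else 2 * df_k$ret[hit]
  }
  mylist[[i]] <- events_k
}

# Stack the per-company pieces back into one data.frame.
result <- do.call(rbind, mylist)
result
```

With the example data, q should come out as 4.4, 1.96, -2.6, -0.04, 4, 3.42 and 0.06. Using seq_along() and seq_len() instead of 1:n also avoids the loop running over c(1, 0) when a subset is empty.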