
Loop through df and create new df in R

I have a df (10 rows, 15 columns)

df<-data.frame(replicate(15,sample(0:1,10,rep=TRUE)))

I want to loop over each column, run a linear regression on it, and collect the results in a new df. Each regression gives me back a list. The predictors for the lm come from a second df:

df2<-data.frame(replicate(2,sample(0:1,10,rep=TRUE)))

I then want to do something like:

new_df <- data.frame()
for (i in 1:ncol(df)){
j<-lm(df[,i] ~ df2$X1 + df2$X2)
temp_df<-j$residuals
new_df[,i]<-cbind(new_df,temp_df)
}

I get the error:

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 0, 8

I have checked other similar posts, but they always seem to involve a function or something similarly complex for a newbie like me. Please help.

Update

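Based on the new example, fit one model per column of df and keep the residuals. Passing the response as a vector avoids a name clash, because df and df2 both have columns called X1 and X2

lst1 <- lapply(names(df), function(nm) lm(df[[nm]] ~ X1 + X2, data = df2)$residuals)
out <- setNames(data.frame(lst1), names(df))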

Also, this doesn't need any loop, because lm accepts a matrix response and fits every column at once

out2 <- lm(as.matrix(df) ~ X1 + X2, data = df2)$residuals
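
As a quick sanity check, the sketch below fits the per-column version and the matrix-response version on the example data and compares the residuals. It assumes df and df2 are built as in the data section of this answer; the names res_loop and res_mat are illustrative only.

```r
set.seed(24)
df  <- data.frame(replicate(15, sample(0:1, 10, replace = TRUE)))
df2 <- data.frame(replicate(2,  sample(0:1, 10, replace = TRUE)))

# One regression per column of df, predictors taken from df2
res_loop <- sapply(df, function(y) residuals(lm(y ~ X1 + X2, data = df2)))

# A single multi-response fit: lm() accepts a matrix on the left-hand side
res_mat <- residuals(lm(as.matrix(df) ~ X1 + X2, data = df2))

# Both are matrices with one row per observation and one column per column of df
dim(res_mat)
all.equal(res_loop, res_mat, check.attributes = FALSE)
```

If the comparison holds, either form can be wrapped in as.data.frame() to get the new df the question asks for.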

Old

We can do this easily without any loop

    new_df <- df + 10

---

If we need a loop, it can be done with `lapply`

    new_df <- df
    new_df[] <- lapply(df, function(x) x + 10)

---

Or with a `for` loop

    lst1 <- vector('list', ncol(df))
    for(i in seq_along(df)) lst1[[i]] <- df[, i] + 10
    # lst1 is unnamed, so restore the original column names
    new_df <- setNames(as.data.frame(lst1), names(df))

data

set.seed(24)
df <- data.frame(replicate(15, sample(0:1, 10, replace = TRUE)))
df2 <- data.frame(replicate(2, sample(0:1, 10, replace = TRUE)))

This can be done without loops, but for your understanding, here is a version with a loop

new_df <- df
for (i in names(df)) {
  # regress column i of df on the two predictors in df2
  j <- lm(df[, i] ~ df2$X1 + df2$X2)
  new_df[i] <- j$residuals
}

You initialise new_df as an empty data frame with 0 rows and 0 columns, so cbind cannot combine it with a residual vector that has one entry per row of df, hence the error. Instead, start new_df as a copy of df, since both share the same structure, and overwrite its columns as above.
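
To see where the error comes from, here is a minimal sketch that reproduces it outside the loop and then applies the fix. It assumes the example data from the question; empty_df, res and msg are illustrative names only.

```r
set.seed(24)
df  <- data.frame(replicate(15, sample(0:1, 10, replace = TRUE)))
df2 <- data.frame(replicate(2,  sample(0:1, 10, replace = TRUE)))

# Residuals of the first regression: one value per row of df
res <- residuals(lm(df[, 1] ~ X1 + X2, data = df2))

# A data frame with 0 rows cannot be column-bound to a vector of length 10
empty_df <- data.frame()
msg <- tryCatch(cbind(empty_df, res), error = function(e) conditionMessage(e))
msg

# Starting from a copy of df gives new_df the right number of rows
new_df <- df
new_df[[1]] <- res
dim(new_df)
```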

I would do as suggested by akrun. But if you do need (or want) to loop for some reason, you can preallocate new_df with the right dimensions and fill it column by column:

df<-data.frame(replicate(15,sample(0:1,10,rep=TRUE)))

new_df <- data.frame(replicate(15, rep(NA, 10)))

for (i in 1:ncol(df)){
new_df[ ,i] <- df[ , i] + 10
}
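
The same preallocate-then-fill pattern carries over to the regression in the question. This is only a sketch, assuming df2 holds the predictors X1 and X2 as in the question; fit is an illustrative name.

```r
set.seed(24)
df  <- data.frame(replicate(15, sample(0:1, 10, replace = TRUE)))
df2 <- data.frame(replicate(2,  sample(0:1, 10, replace = TRUE)))

# Preallocate a data frame of the same shape as df, filled with NA
new_df <- data.frame(matrix(NA_real_, nrow = nrow(df), ncol = ncol(df)))
names(new_df) <- names(df)

for (i in seq_len(ncol(df))) {
  # regress column i of df on the predictors stored in df2
  fit <- lm(df[, i] ~ X1 + X2, data = df2)
  new_df[, i] <- residuals(fit)
}

dim(new_df)
```

Because each model includes an intercept, the residuals in every column should sum to roughly zero, which is a cheap way to check the loop filled every column.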


 