How to replace several variables with several variables from another dataframe in R using a loop?

Question

I would like to replace multiple variables with variables from a second dataframe in R.

df1$var1 <- df2$var1
df1$var2 <- df2$var2

# and so on ...

As you can see, the variable names are the same in both data frames, but the numeric values differ slightly. The correct values are in df2 and need to end up in df1. I need to do this for many, many variables in a complex data set and wonder whether someone could help with a more efficient way to code this (possibly without using column references).

Here is some example data:

# dataframe 1
var1 <- c(1:10)
var2 <- c(1:10)
df1 <- data.frame(var1,var2)

# dataframe 2
var1 <- c(11:20)
var2 <- c(11:20)
df2 <- data.frame(var1,var2)

# assigning correct values
df1$var1 <- df2$var1
df1$var2 <- df2$var2

Answer 1

As Parfait has said, the current post seems a bit too simplified to give any immediate help, but I will try to summarize what you may need for something like this to work.

If the assumption is that df1 and df2 have the same number of rows AND that their orders are already matching, then you can achieve this really easily by the following subset notation:

df1[, c({column names df1})] <- df2[, c({column names df2}), drop = FALSE]

Let's say that df1 has columns a, b, and c, and you want to replace b and c with two columns of df2, whose columns are x, y, and z.

df1[, c("b", "c")] <- df2[, c("y", "z"), drop = FALSE]

Here we are replacing b with y and c with z; df1 keeps its own column names. The drop argument on the right-hand side is just added protection: it ensures that subsetting a data.frame returns a data.frame rather than a vector, even when only one column is selected. It is left out on the left-hand side because the data.frame replacement method "[<-" does not accept a drop argument.
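For the case in the question, where the column names are identical in both data frames, the same idea can be applied to every shared column at once. The following is a minimal sketch using the question's example data; the helper names common and df1_loop are introduced here for illustration, and it assumes the rows of df1 and df2 are already in the same order.

```r
# Example data from the question
df1 <- data.frame(var1 = 1:10, var2 = 1:10)
df2 <- data.frame(var1 = 11:20, var2 = 11:20)

# Names of the columns that exist in both data frames
common <- intersect(names(df1), names(df2))

# Assumption: both data frames have the same number of rows, in the same order
stopifnot(nrow(df1) == nrow(df2))

# Loop version, as asked for in the title (done on a copy for comparison)
df1_loop <- df1
for (v in common) {
  df1_loop[[v]] <- df2[[v]]
}

# Vectorised version: overwrite all shared columns in one assignment
df1[common] <- df2[common]
```

Both versions give the same result. If only some of the shared columns should be replaced, common can be swapped for a hand-written character vector of column names.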

If you do NOT know that the order is correct, or one data frame may have a different number of rows from the other, BUT there is a unique identifier shared by the two data.frames, then I would personally use a function that is designed for merging two data frames. Depending on your preference, you can use merge from base R or the *_join functions from the dplyr package (my preference).

library(dplyr)
#assuming a and x are unique identifiers that can be matched.
new_df <- left_join(df1, df2, by = c("a"="x"))
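Note that a join adds the columns of df2 next to those of df1 rather than overwriting anything. If the goal is to overwrite values in df1 in place, one option is to look up matching rows by key with base R's match(). This is only a sketch under stated assumptions: the key column id and the data below are made up for illustration, and id is assumed to be unique within df2.

```r
# Hypothetical data: `id` is the shared key; df2 is shuffled and has an extra row
df1 <- data.frame(id = 1:5, var1 = 1:5, var2 = 1:5)
df2 <- data.frame(
  id   = c(3, 1, 6, 2, 5, 4),
  var1 = c(13, 11, 16, 12, 15, 14),
  var2 = c(23, 21, 26, 22, 25, 24)
)

vars <- c("var1", "var2")      # columns to overwrite

# For each row of df1, the position of the matching row in df2 (NA if none)
idx <- match(df1$id, df2$id)
found <- !is.na(idx)

# Overwrite only the rows of df1 that have a match in df2
df1[found, vars] <- df2[idx[found], vars]
```

Rows of df1 without a match in df2 keep their original values, and rows that exist only in df2 (here id 6) are ignored.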

solution1
1 ACCEPTED 2020-12-07 19:24:18