How to replace several variables with several variables from another dataframe in R using a loop?
I would like to replace multiple variables with variables from a second dataframe in R.
df1$var1 <- df2$var1
df1$var2 <- df2$var2
# and so on ...
As you can see, the variable names are the same in both dataframes. However, the numeric values differ slightly: the correct version is in df2 but needs to be in df1. I need to do this for many, many variables in a complex data set and wonder whether someone could help with a more efficient way to code this (possibly without using column references).
Here is some example data:
# dataframe 1
var1 <- c(1:10)
var2 <- c(1:10)
df1 <- data.frame(var1,var2)
# dataframe 2
var1 <- c(11:20)
var2 <- c(11:20)
df2 <- data.frame(var1,var2)
# assigning correct values
df1$var1 <- df2$var1
df1$var2 <- df2$var2
As Parfait has said, the current post seems a bit too simplified to give any immediate help, but I will try to summarize what you may need for something like this to work.
If the assumption is that df1 and df2 have the same number of rows AND that their rows are already in matching order, then you can achieve this really easily with the following subset notation:
df1[, c({column names df1})] <- df2[, c({column names df2}), drop = FALSE]
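For the example data in the question, where the column names are identical in both data frames, the template can be filled in without typing any names. The following is only a sketch: it assumes every column that exists in both data frames should be overwritten, and the name `shared` is introduced here purely for illustration.

```r
# Example data from the question
df1 <- data.frame(var1 = 1:10, var2 = 1:10)
df2 <- data.frame(var1 = 11:20, var2 = 11:20)

# Columns present in both data frames
# (assumption: all of them should be overwritten in df1)
shared <- intersect(names(df1), names(df2))

# Vectorised replacement; rows are assumed to be in the same order
df1[shared] <- df2[shared]

# Equivalent explicit loop, as asked for in the question
for (col in shared) {
  df1[[col]] <- df2[[col]]
}
```

The vectorised line and the loop do the same thing, so only one of them is needed. If only some of the shared columns should be replaced, `shared` can be subset before the assignment.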
Let's say that df1 has columns a, b, and c, and you want to replace b and c with two columns of df2, whose columns are x, y, and z.
df1[, c("b", "c")] <- df2[, c("y", "z"), drop = FALSE]
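Here we are replacing b with y and c with z. The drop argument is just added protection when subsetting a data.frame: it ensures you get a data.frame back rather than a vector, which matters when only a single column is selected. It is only needed on the right-hand side, where the extraction happens; the replacement form of [ for data frames does not take a drop argument.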
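To see what drop protects against, here is a small illustration. The data frame contents are made up for the example; only the column names x, y, and z come from the text above.

```r
# Hypothetical df2 with the columns named in the answer
df2 <- data.frame(x = 1:3, y = 4:6, z = 7:9)

# Selecting a single column: drop defaults to TRUE, so the result is a plain vector
one_col_vector <- df2[, "y"]

# With drop = FALSE the result stays a one-column data.frame
one_col_frame <- df2[, "y", drop = FALSE]

is.data.frame(one_col_vector)  # FALSE
is.data.frame(one_col_frame)   # TRUE
```

When two or more columns are selected the result is a data.frame either way, so drop = FALSE mainly matters if the vector of column names might have length one.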
If you do NOT know that the order is correct, or one data frame may differ in size from the other, BUT there is a unique identifier shared between the two data.frames, then I would personally use a function that is designed for merging two data frames. Depending on your preference, you can use merge from base or the *_join functions from the dplyr package (my preference).
library(dplyr)
#assuming a and x are unique identifiers that can be matched.
new_df <- left_join(df1, df2, by = c("a"="x"))
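Note that the left_join call returns df1 with y and z added as extra columns rather than overwriting b and c. If the goal is to overwrite in place, one possible sketch in base R uses match on the identifier. The data below is invented for illustration, and the helper names `idx` and `found` are hypothetical; the sketch assumes x is unique in df2.

```r
# Made-up example: df2 is in a different order and lacks the row with a == 4
df1 <- data.frame(a = c(1, 2, 3, 4),
                  b = c(10, 20, 30, 40),
                  c = c(0.1, 0.2, 0.3, 0.4))
df2 <- data.frame(x = c(3, 1, 2),
                  y = c(31, 11, 21),
                  z = c(0.31, 0.11, 0.21))

# For each row of df1, the matching row number in df2 (NA if there is none)
idx <- match(df1$a, df2$x)
found <- !is.na(idx)

# Overwrite b and c only where a match exists; unmatched rows keep their old values
df1[found, c("b", "c")] <- df2[idx[found], c("y", "z"), drop = FALSE]
```

Rows of df1 without a partner in df2 (here the one with a equal to 4) are left unchanged, which may or may not be what you want; setting them to NA instead would be a separate design choice.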