Is there a way in R to compute a new column on a df based on another df?
Is it possible to do something like this in R (assuming both df1 and df2 have the same number of rows)?
if (df1$var1 = 8) df2$var1 = 1.
if (df1$var2 = 9) df2$var2 = 1.
Here is one simple option in base R, where we replicate the values 8, 9 to make the lengths the same and compare them with the subset of columns of 'df1', resulting in a logical matrix. Subset 'df2' with that matrix and assign 1 to the matching elements.
nm1 <- c('var1', 'var2')
df2[nm1][df1[nm1] == c(8, 9)[col(df1[nm1])]] <- 1
df2
# var1 var2 var3
#1 5 1 1
#2 3 1 2
#3 1 3 3
#4 1 4 4
#5 4 2 5
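To see why the one-liner works, it can help to look at the intermediate pieces. The sketch below rebuilds the same example data and splits the expression into steps; `target` and `mask` are names introduced here purely for illustration and are not part of the original answer.

```r
# Same example data as in this answer
df1 <- data.frame(var1 = c(1, 3, 8, 5, 2), var2 = c(9, 3, 1, 8, 4), var3 = 1:5)
df2 <- data.frame(var1 = c(5, 3, 2, 1, 4), var2 = c(3, 1, 3, 4, 2), var3 = 1:5)
nm1 <- c('var1', 'var2')

# col() gives the column index of every cell, so indexing c(8, 9) with it
# yields 8 for each cell of column 1 and 9 for each cell of column 2
target <- c(8, 9)[col(df1[nm1])]   # illustrative name

# Element-wise comparison returns a logical matrix with the shape of df1[nm1]
mask <- df1[nm1] == target         # illustrative name
which(mask, arr.ind = TRUE)        # (row, col) positions that will be set to 1

# Use the logical matrix to pick the cells of df2 to overwrite
df2[nm1][mask] <- 1
df2
```

Splitting the expression this way makes it easy to inspect `mask` before anything in 'df2' is changed.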
Or this can be done in two steps:
df2$var1[df1$var1 == 8] <- 1
df2$var2[df1$var2 == 9] <- 1
Or using Map:
df2[nm1] <- Map(function(x, y, z) replace(x, y == z, 1),
df2[nm1], df1[nm1], c(8, 9))
An if/else loop can also be used, but if/else is not vectorized, i.e. it expects its condition to be of length 1. If we loop over the rows and columns, it can be done (but it would be inefficient in R):
vals <- c(8, 9)
for(i in seq_len(nrow(df1))) {
for(j in seq_along(nm1)) {
if(df1[[nm1[j]]][i] == vals[j]) df2[[nm1[j]]][i] <- 1
}
}
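As a small illustration of why a plain `if` needs the loop: `if` takes a single TRUE/FALSE, so handing it a whole column comparison does not work. In R 4.2.0 and later, a condition of length greater than one is an error; earlier versions only warned and used the first element. The sketch below catches either outcome; `outcome` is a name made up for this example.

```r
df1 <- data.frame(var1 = c(1, 3, 8, 5, 2))

# df1$var1 == 8 is a logical vector of length 5, not a single TRUE/FALSE
outcome <- tryCatch(
  {
    if (df1$var1 == 8) "ran"
    "no condition signalled"
  },
  warning = function(w) "warning",  # behaviour before R 4.2.0
  error   = function(e) "error"     # behaviour from R 4.2.0 onwards
)
outcome

# Testing one element at a time is fine, which is what the loop does
if (df1$var1[3] == 8) "row 3 matches"
```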
Data used in the examples above:
df1 <- data.frame(var1 = c(1, 3, 8, 5, 2), var2 = c(9, 3, 1, 8, 4),
var3 = 1:5)
df2 <- data.frame(var1 = c(5, 3, 2, 1, 4), var2 = c(3, 1, 3, 4, 2),
var3 = 1:5)
A simple two-line solution uses the base R ifelse function:
df1 <- data.frame(var1 = c(1:10), var2 = c(1:10))
df2 <- data.frame(var1 = c(1:10), var2 = c(1:10))
df2$var1 <- ifelse(df1$var1 == 8, 1, df2$var1)
df2$var2 <- ifelse(df1$var2 == 9, 1, df2$var2)
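As a quick check, the same ifelse lines can be run on the five-row data from the first answer and compared with the two-step logical indexing shown there. This is only a sketch; `a` and `b` are names introduced here for the two copies of 'df2'.

```r
df1 <- data.frame(var1 = c(1, 3, 8, 5, 2), var2 = c(9, 3, 1, 8, 4), var3 = 1:5)
df2 <- data.frame(var1 = c(5, 3, 2, 1, 4), var2 = c(3, 1, 3, 4, 2), var3 = 1:5)

# ifelse(): take 1 where the condition holds, otherwise keep the old value
a <- df2
a$var1 <- ifelse(df1$var1 == 8, 1, a$var1)
a$var2 <- ifelse(df1$var2 == 9, 1, a$var2)

# Two-step logical indexing, for comparison
b <- df2
b$var1[df1$var1 == 8] <- 1
b$var2[df1$var2 == 9] <- 1

# Compare the two results column by column
all.equal(a, b)
```

Both forms are vectorized; ifelse builds a whole new column, while the indexing form only overwrites the matching elements.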