Validating values in a column between two data frames in R based on conditions
I have two data frames. I have to match the first two columns between nndf and tndf, and if there is a match I have to check whether the values in the third column are the same and update a third data frame. The problem is that nndf is longer than tndf.
nndf <- data.frame("var1" = c("ABC","ABC","DEF", "FED","DGS"), "var2" = c("xyz","abc","def","dsf","dsf"), "var3" = c(1234.21,3432.12,0.12,1232.44,873.00))
tndf <- data.frame("var1" = c("ABC","ABC","DEF"), "var2" = c("xyz","abc","def"), "var3" = c(1234.21,3432.12,0.11))
ndf <- data.frame("var1" = c("ABC","ABC"), "var2" = c("xyz","abc"))
I want to populate the results in the third data frame. This data frame takes the common values from the first two columns of nndf and tndf and, wherever they are common, checks the third column (for example 1234.21 and 3432.12). If the values are the same, it returns TRUE and fills the column. The desired output is
var1 var2 var3
ABC xyz TRUE (indicating 1234.21 and 1234.21 in first two df are same)
ABC abc TRUE
DEF def FALSE (indicating 0.12 is not equal to 0.11)
I tried using a for loop with an if condition. However, it iterates through each row multiple times while filling in the results.
We could do an inner_join on the first two columns and then compare the values in the two var3 columns
library(dplyr)
inner_join(nndf, tndf, by = c("var1", "var2")) %>%
  mutate(var3 = var3.x == var3.y) %>%
  dplyr::select(var1, var2, var3)
# var1 var2 var3
#1 ABC xyz TRUE
#2 ABC abc TRUE
#3 DEF def FALSE
Or similarly in base R
df1 <- merge(nndf, tndf, by = c("var1", "var2"))
df1$var3 <- df1$var3.x == df1$var3.y
# keep only the key columns and the comparison result
df1[c("var1", "var2", "var3")]
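Since var3 holds floating-point numbers, an exact == can return FALSE for values that differ only by rounding error. The following is a minimal base R sketch of the same merge with a tolerance-based comparison. The tolerance `tol`, the suffixes `.nn`/`.tn`, and the name `res` are assumptions chosen for illustration and are not part of the original answer.

```r
# Data from the question
nndf <- data.frame(var1 = c("ABC", "ABC", "DEF", "FED", "DGS"),
                   var2 = c("xyz", "abc", "def", "dsf", "dsf"),
                   var3 = c(1234.21, 3432.12, 0.12, 1232.44, 873.00))
tndf <- data.frame(var1 = c("ABC", "ABC", "DEF"),
                   var2 = c("xyz", "abc", "def"),
                   var3 = c(1234.21, 3432.12, 0.11))

tol <- 1e-8  # hypothetical tolerance; pick one that suits the data

# Inner join on the two key columns; rows of nndf without a partner are dropped
res <- merge(nndf, tndf, by = c("var1", "var2"), suffixes = c(".nn", ".tn"))

# TRUE when the two var3 values agree within the tolerance
res$var3 <- abs(res$var3.nn - res$var3.tn) < tol

# Keep only the key columns and the comparison result
res <- res[c("var1", "var2", "var3")]
res
```

Note that merge() sorts the result by the key columns by default, so the row order may differ from the order in tndf.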
We can use %in% in base R to create the logical vector
tndf$var3 <- do.call(paste, tndf) %in% do.call(paste, nndf)
tndf
# var1 var2 var3
#1 ABC xyz TRUE
#2 ABC abc TRUE
#3 DEF def FALSE
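One thing to keep in mind with the pasted-row approach is that it gives FALSE both when the key pair is missing from nndf and when the key is present but var3 differs. A possible variant, sketched below, uses match() on the first two columns only, so that missing keys come back as NA instead. The separator "\r" and the names `key_n`, `key_t`, `idx` and `res` are assumptions made for this illustration.

```r
# Data from the question
nndf <- data.frame(var1 = c("ABC", "ABC", "DEF", "FED", "DGS"),
                   var2 = c("xyz", "abc", "def", "dsf", "dsf"),
                   var3 = c(1234.21, 3432.12, 0.12, 1232.44, 873.00))
tndf <- data.frame(var1 = c("ABC", "ABC", "DEF"),
                   var2 = c("xyz", "abc", "def"),
                   var3 = c(1234.21, 3432.12, 0.11))

# Build one key per row from the two key columns; "\r" is an arbitrary
# separator assumed not to occur in the data
key_n <- paste(nndf$var1, nndf$var2, sep = "\r")
key_t <- paste(tndf$var1, tndf$var2, sep = "\r")

# Position of each tndf key in nndf (NA if the key is absent)
idx <- match(key_t, key_n)

# Compare var3 for matched rows; unmatched keys yield NA
res <- data.frame(var1 = tndf$var1,
                  var2 = tndf$var2,
                  var3 = tndf$var3 == nndf$var3[idx])
res$var3
# [1]  TRUE  TRUE FALSE
```

This keeps the row order of tndf and does not modify either input data frame.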
Or using a join (starting from the original tndf, before var3 was overwritten by the %in% step above)
library(data.table)
setDT(tndf)[nndf, var3n := var3 == i.var3, on = .(var1, var2)]
tndf[, .(var1, var2, var3 = var3n)]
# var1 var2 var3
#1: ABC xyz TRUE
#2: ABC abc TRUE
#3: DEF def FALSE