
Validating values in column between two data frames in R based on conditions

I have two data frames, nndf and tndf. I need to match the first two columns between nndf and tndf and, where there is a match, check whether the values in the third column are the same, then record the result in a third data frame. The problem is that nndf is longer than tndf.

nndf <- data.frame("var1" = c("ABC","ABC","DEF", "FED","DGS"), "var2" = c("xyz","abc","def","dsf","dsf"), "var3" = c(1234.21,3432.12,0.12,1232.44,873.00))

tndf <- data.frame("var1" = c("ABC","ABC","DEF"), "var2" = c("xyz","abc","def"), "var3" = c(1234.21,3432.12,0.11))

ndf <- data.frame("var1" = c("ABC","ABC"), "var2" = c("xyz","abc"))

I want to populate the results in the third data frame. It takes the combinations of the first two columns that are common to nndf and tndf and, for each of them, compares the values in the third column (for example 1234.21 and 3432.12). If the values are the same, the column is filled with TRUE. The desired output is:

var1   var2    var3
ABC    xyz     TRUE (indicating 1234.21 and 1234.21 in first two df are same)
ABC    abc     TRUE
DEF    def     FALSE (indicating 0.12 is not equal to 0.11)

I tried using a for loop with an if condition. However, it iterates over each row multiple times when filling in the results.

We could do an inner_join and then compare the values in the two columns:

library(dplyr)

inner_join(nndf, tndf, by = c("var1", "var2")) %>%
   mutate(var3 = var3.x == var3.y) %>%
   dplyr::select(var1, var2, var3)


#  var1 var2  var3
#1  ABC  xyz  TRUE
#2  ABC  abc  TRUE
#3  DEF  def FALSE

Or similarly in base R:

df1 <- merge(nndf, tndf, by = c("var1", "var2"))
df1$var3 <- df1$var3.x == df1$var3.y
# keep only the key columns and the comparison result
df1[c("var1", "var2", "var3")]
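
Because var3 is numeric, == tests exact equality, so two values that differ only by floating-point rounding would be reported as FALSE. The following is a minimal sketch of the same base R merge with a tolerance-based comparison instead. The tolerance tol and the name res are assumptions made for illustration and do not come from the original post.

```r
nndf <- data.frame(var1 = c("ABC", "ABC", "DEF", "FED", "DGS"),
                   var2 = c("xyz", "abc", "def", "dsf", "dsf"),
                   var3 = c(1234.21, 3432.12, 0.12, 1232.44, 873.00))
tndf <- data.frame(var1 = c("ABC", "ABC", "DEF"),
                   var2 = c("xyz", "abc", "def"),
                   var3 = c(1234.21, 3432.12, 0.11))

tol <- 1e-8  # assumed tolerance; choose one that suits the data

# Inner join on the two key columns; the clashing var3 columns
# come back as var3.x (from nndf) and var3.y (from tndf)
res <- merge(nndf, tndf, by = c("var1", "var2"))

# TRUE when the two values agree within the tolerance
res$var3 <- abs(res$var3.x - res$var3.y) < tol

# Keep only the key columns and the comparison result
res <- res[c("var1", "var2", "var3")]
res
```

Note that merge sorts the result by the key columns by default, so the row order may differ from the order in tndf.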

We can use %in% in base R to create the logical vector. This pastes all three columns of each row together, so a row of tndf is TRUE only when an identical row exists in nndf:

tndf$var3 <- do.call(paste, tndf) %in% do.call(paste, nndf)
tndf
#  var1 var2  var3
#1  ABC  xyz  TRUE
#2  ABC  abc  TRUE
#3  DEF  def FALSE
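
With the paste and %in% approach, a row of tndf whose key pair is missing from nndf also comes out as FALSE, the same as a row whose key matches but whose value differs. If that distinction matters, one option is to look up the key with match and compare the values afterwards, which leaves NA for unmatched keys. This is only a sketch: the extra row in tndf2, the "|" separator, and the assumption that each key pair occurs at most once in nndf are all introduced here for illustration.

```r
nndf <- data.frame(var1 = c("ABC", "ABC", "DEF", "FED", "DGS"),
                   var2 = c("xyz", "abc", "def", "dsf", "dsf"),
                   var3 = c(1234.21, 3432.12, 0.12, 1232.44, 873.00))
tndf <- data.frame(var1 = c("ABC", "ABC", "DEF"),
                   var2 = c("xyz", "abc", "def"),
                   var3 = c(1234.21, 3432.12, 0.11))

# Hypothetical extra row whose key does not exist in nndf
tndf2 <- rbind(tndf, data.frame(var1 = "ZZZ", var2 = "zzz", var3 = 1))

# Build one key per row from the first two columns only;
# "|" is assumed not to occur in the key values
key_n <- paste(nndf$var1, nndf$var2, sep = "|")
key_t <- paste(tndf2$var1, tndf2$var2, sep = "|")

# Position of each tndf2 key in nndf (first match), NA when absent
idx <- match(key_t, key_n)

# Compare the third column; NA propagates for unmatched keys
same <- tndf2$var3 == nndf$var3[idx]

data.frame(tndf2[c("var1", "var2")], var3 = same)
```

Unlike the snippet above, this keeps tndf itself unchanged and returns the result as a new data frame.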

Or using a data.table join (run on the original tndf, because the previous snippet overwrites its var3 column):

library(data.table)
setDT(tndf)[nndf, var3n := var3 == i.var3, on = .(var1, var2)]
tndf[, .(var1, var2, var3 = var3n)]
#   var1 var2  var3
#1:  ABC  xyz  TRUE
#2:  ABC  abc  TRUE
#3:  DEF  def FALSE
