Remove the rows of one data frame from another, but keep duplicates, in R

Question

I'm working in R and have two data frames: a base data frame, and another that holds the rows I need to remove from the base one. I can't use the setdiff() function, because it also removes duplicated rows. Here's an example:

a <- data.frame(var1 = c(1, NA, 2, 2, 3, 4, 5),
                var2 = c(1, 7, 2, 2, 3, 4, 5))

b <- data.frame(id = c(2, 4),
                numero = c(2, 4))

And the result must be:

id numero
1 1
NA 7
2 2
3 3
5 5

The algorithm must also be efficient, because the base data frame has 3 million rows and 26 columns.

Answer 1

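We may need to create a sequence column before joining, so that each repeated row gets its own occurrence number and the anti-join removes only as many copies as appear in b.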

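library(data.table)
# Number repeated rows within each group (1, 2, ...) in both tables, by reference.
setDT(a)[, rn := rowid(var1, var2)]
setDT(b)[, rn := rowid(id, numero)]
# Anti-join on both columns plus the occurrence number, then drop the helper
# column from the result (a and b themselves keep rn).
a[!b, on = .(var1 = id, var2 = numero, rn)][, rn := NULL][]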

Output

   var1  var2
   <num> <num>
1:     1     1
2:    NA     7
3:     2     2
4:     3     3
5:     5     5
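
The occurrence-number idea behind the answer can also be sketched in base R, without any packages. This is only an illustration under stated assumptions, not the answer's method: the names key_a, key_b, occ_a, occ_b, keep and result are hypothetical, the rows are compared through pasted string keys (so numeric values are matched by their printed form), and it has not been benchmarked on 3 million rows, where the data.table join is probably the better choice.

```r
# Example data from the question.
a <- data.frame(var1 = c(1, NA, 2, 2, 3, 4, 5),
                var2 = c(1, 7, 2, 2, 3, 4, 5))
b <- data.frame(id = c(2, 4),
                numero = c(2, 4))

# One string key per row (assumption: "|" never occurs in the data).
key_a <- paste(a$var1, a$var2, sep = "|")
key_b <- paste(b$id, b$numero, sep = "|")

# Occurrence number of each row within its group of identical keys,
# playing the role of rowid() in the data.table answer.
occ_a <- ave(seq_along(key_a), key_a, FUN = seq_along)
occ_b <- ave(seq_along(key_b), key_b, FUN = seq_along)

# Drop a row of `a` only if `b` holds the same key with the same
# occurrence number, so one copy is removed per matching row of `b`.
keep <- !(paste(key_a, occ_a) %in% paste(key_b, occ_b))
result <- a[keep, ]
print(result)
```

With the example data this keeps rows 1, 2, 4, 5 and 7 of a, so one of the two (2, 2) rows survives and the original row names are retained.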

Answer 1 · 0 · Accepted · 2022-04-13 15:15:08