Identifying unique values across 2 data frames in R

Question

I am working with 2 data frames. I want a file that outputs rows that appear in data frame 1, but do not appear in data frame 2. Here is sample data:

df1:
id    visit
094-1   2
094-2   3
0813-1  11
0813-3  22

df2:
id    visit
094-1   2
094-2   3
0819-2  8

This is what I want:

df3:
id    visit
0819-2  8

I tried this, but it is not working. Can someone please help?

library(tidyverse)
df1 %in% df2 -> x
df2[!x,]-> df3

Answer 1

In dplyr, there is a function, setdiff, for this. It returns the rows of its first argument that do not appear in its second:

df1 = data.frame(id=c("094-1","094-2","0813-1","0813-3"),visit=c(2,3,11,22))
df2 = data.frame(id=c("094-1","094-2","0819-2"),visit=c(2,3,8))

dplyr::setdiff(df2,df1)
#       id visit
# 1 0819-2     8

Or:

library(dplyr)
setdiff(df2,df1)

Answer 2

Using data.table:

library(data.table)
fsetdiff(setDT(df2), setDT(df1))
#      id visit
#1: 0819-2     8
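
One thing worth knowing about the call above is that setDT() converts df1 and df2 to data.tables in place. The following is a minimal sketch of a variant that leaves the original data frames untouched by copying them with as.data.table(), and also shows data.table's anti-join syntax under the assumption that `id` alone identifies a row. The names `dt1`, `dt2`, `by_row`, and `by_id` are illustrative.

```r
library(data.table)

df1 <- data.frame(id = c("094-1", "094-2", "0813-1", "0813-3"),
                  visit = c(2, 3, 11, 22))
df2 <- data.frame(id = c("094-1", "094-2", "0819-2"),
                  visit = c(2, 3, 8))

# as.data.table() returns a copy, so df1 and df2 remain plain data frames.
dt1 <- as.data.table(df1)
dt2 <- as.data.table(df2)

# Whole-row comparison: rows of dt2 that have no identical row in dt1.
by_row <- fsetdiff(dt2, dt1)

# Anti-join on id only (assumption: id is the key): rows of dt2 whose id is not in dt1.
by_id <- dt2[!dt1, on = "id"]
```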

Answer 3

A base R solution using a similar approach to the code included in your question. It uses the %in% operator on the id columns and negates the result with the ! operator, so it keeps the rows of df2 whose id does not appear in df1.

Data:

df1 <- data.frame(
  id = c("094-1", "094-2", "0813-1", "0813-3"),
  visit = c(2,3,11,22)
)

df2 <- data.frame(
  id = c("094-1", "094-2", "0819-2"),
  visit = c(2,3, 8)
)

Code:

df3 <- df2[!df2$id %in% df1$id,]

Output:

df3

#>       id visit
#> 3 0819-2     8

Created on 2020-11-29 by the reprex package (v0.3.0)
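
For context on why the attempt in the question does not filter rows: %in% applied to two data frames treats each one as a list of columns, so it returns one value per column instead of one per row. The sketch below demonstrates that, then shows a hedged variation on the base R approach that compares whole rows by pasting each row into a single key string. The names `key1` and `key2` and the separator are illustrative choices, and the key approach assumes the separator character never occurs in the data.

```r
df1 <- data.frame(
  id = c("094-1", "094-2", "0813-1", "0813-3"),
  visit = c(2, 3, 11, 22)
)
df2 <- data.frame(
  id = c("094-1", "094-2", "0819-2"),
  visit = c(2, 3, 8)
)

# %in% on data frames works column-wise: x has one element per column of df1,
# so it cannot be used directly as a row filter.
x <- df1 %in% df2

# Build one key string per row from all columns.
# Assumption: the separator "\r" never appears in any value.
key1 <- do.call(paste, c(df1, sep = "\r"))
key2 <- do.call(paste, c(df2, sep = "\r"))

# Keep the rows of df2 whose full row does not occur in df1.
df3 <- df2[!(key2 %in% key1), , drop = FALSE]
```

Matching on id alone drops a row of df2 whose id exists in df1 even if its visit value differs; the key-based version keeps such a row. Which behaviour is right depends on whether id is meant to identify a row uniquely.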

3 answers

Answer 1
1 2020-11-29 18:04:34

Answer 2
0 Accepted 2020-11-29 18:42:56

Answer 3
0 2020-11-29 18:51:22
