Find and remove duplicate observations across two columns in R

Question

I have an example data set like this:


df1 <- data.frame(c1=c('a','b','c','d','e','f','g', 'h'),
         c2=c('l','m','a','g','e','q','a','d'))

I just want a data frame with the values that appear in both c1 and c2 removed. I already know how to get the unique elements of c1 and c2, but what do I do after that to end up with something like the following?

data.frame(c1=c('b','c','f','h'), c2=c('l','m','q',NA))

Answer 1

One option is to get the elements common to both columns with Reduce and intersect, drop those elements from each column with %in% and !, and then pad the shorter column with NA at the end:

v1 <- Reduce(intersect, df1)
lst1 <- lapply(df1, function(x) x[!x %in% v1])
data.frame(lapply(lst1, `length<-`, max(lengths(lst1))))
#  c1   c2
#1  b    l
#2  c    m
#3  f    q
#4  h <NA>

Data

df1 <- data.frame(c1=c('a','b','c','d','e','f','g', 'h'),
         c2=c('l','m','a','g','e','q','a','d'))
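
To see what each step of this answer contributes, the intermediate objects can be printed one at a time. This is only an illustrative walk-through of the code above, reusing the same df1, v1 and lst1; nothing new is introduced.

```r
df1 <- data.frame(c1 = c('a','b','c','d','e','f','g','h'),
                  c2 = c('l','m','a','g','e','q','a','d'))

# Values that occur in both columns
v1 <- Reduce(intersect, df1)
v1
# [1] "a" "d" "e" "g"

# Each column with the shared values dropped; the two pieces differ in length
lst1 <- lapply(df1, function(x) x[!x %in% v1])
lengths(lst1)
# c1 c2 
#  4  3 
```

Because the two pieces have different lengths, `length<-` is what extends the shorter one with NA so they fit into one data frame. Note that with more than two columns, Reduce(intersect, df1) would return only the values present in every column, not in any pair of columns.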

Answer 2

A one-liner in base R:

sapply(list(df1$c1[!df1$c1%in%df1$c2], 
            df1$c2[!df1$c2%in%df1$c1]), '[', 1:length(setdiff(df1$c1, df1$c2)))

#     [,1] [,2]
# [1,] "b"  "l" 
# [2,] "c"  "m" 
# [3,] "f"  "q" 
# [4,] "h"  NA
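
The one-liner takes its row count from setdiff(df1$c1, df1$c2), so it appears to rely on c1 keeping at least as many values as c2; if c2 kept more, the extra c2 values would be cut off. A minimal sketch of a variant that pads to whichever side is longer is shown below. The helper name drop_shared is hypothetical and not part of either answer.

```r
# Hypothetical helper (not from the original answers): drop every value that
# appears in both vectors, then pad the shorter result with NA.
drop_shared <- function(x, y) {
  x <- as.character(x)
  y <- as.character(y)
  keep_x <- x[!x %in% y]
  keep_y <- y[!y %in% x]
  n <- max(length(keep_x), length(keep_y))
  # Indexing past the end of a vector yields NA, which does the padding
  data.frame(c1 = keep_x[seq_len(n)],
             c2 = keep_y[seq_len(n)],
             stringsAsFactors = FALSE)
}

df1 <- data.frame(c1 = c('a','b','c','d','e','f','g','h'),
                  c2 = c('l','m','a','g','e','q','a','d'))
res <- drop_shared(df1$c1, df1$c2)
res
#   c1   c2
# 1  b    l
# 2  c    m
# 3  f    q
# 4  h <NA>
```

Unlike the sapply version, this returns a data frame rather than a character matrix, which matches the shape asked for in the question.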

Answer 1: score 3, accepted, 2019-12-17 21:30:16
Answer 2: score 2, 2019-12-17 21:44:06
