Extract rows in first data.frame for which certain values are not found in second data.frame

I'm trying to remove from a first data frame all of the rows for which a certain value appears in a second data frame.

I'm using the R programming language for statistical data analysis.

This is the first question I've posted here, so please bear with me ;)

I work with confidential data, so I recreated the problem with an example.

Name=c("Bussieres", "Nelson")
Fname=c("Paul", "Robert")
Tel=c(123,234)
comp1=data.frame(Name, Fname, Tel)

Name=c("Bussieres","Bussieres","Nelson","Nelson")
Fname=c("Robert","Paul","Paul","Paula")
Tel=c(123,234,345,456)
comp2=data.frame(Name, Fname, Tel)

comp1 returns:

       Name  Fname Tel
1 Bussieres   Paul 123
2    Nelson Robert 234

comp2 returns:

       Name  Fname Tel
1 Bussieres Robert 123
2 Bussieres   Paul 234
3    Nelson   Paul 345
4    Nelson  Paula 456

Now, what I want is to return the rows of comp1 whose combination of "Name" and "Fname" does not appear in any row of comp2.

The expected result, to be stored in a new data frame comp3, would be (slight edit made here; I had posted erroneous expected results):

    Name  Fname Tel
1 Nelson Robert 234

My first attempts used the match function, but that didn't quite work.

The following attempt at a for loop also didn't work.

for (i in comp1[,"Name"]){for (j in comp3[,"Name"]){if i!=j return comp3=x1["Name"==i,]}}

I'm surprised that I can't find basic (primitive) functions in R to do this, as excluding certain observations from a data set would be a very routine procedure.

A data.table solution:

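library(data.table)
# Key both tables on the columns that identify a person
dt1 <- data.table(comp1, key=c("Name", "Fname"))
dt2 <- data.table(comp2, key=c("Name", "Fname"))
# Not-join: keep the rows of dt1 whose key has no match in dt2
dt1[!dt2]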

#      Name  Fname Tel
# 1: Nelson Robert 234
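
For comparison, the same anti-join can be sketched in base R without any extra package. This is only one possible approach, not necessarily the best one: it pastes "Name" and "Fname" into a single key per row and keeps the rows of comp1 whose key is missing from comp2. The helper names key1 and key2 and the "\r" separator are illustrative choices that are not in the original post, and the separator is assumed never to occur inside a name.

```r
# Recreate the example data from the question
comp1 <- data.frame(Name  = c("Bussieres", "Nelson"),
                    Fname = c("Paul", "Robert"),
                    Tel   = c(123, 234),
                    stringsAsFactors = FALSE)
comp2 <- data.frame(Name  = c("Bussieres", "Bussieres", "Nelson", "Nelson"),
                    Fname = c("Robert", "Paul", "Paul", "Paula"),
                    Tel   = c(123, 234, 345, 456),
                    stringsAsFactors = FALSE)

# Build one composite key per row from the two columns being compared.
# The "\r" separator is an arbitrary choice (assumption: it never appears
# in a name), so that e.g. "ab" + "c" cannot collide with "a" + "bc".
key1 <- paste(comp1$Name, comp1$Fname, sep = "\r")
key2 <- paste(comp2$Name, comp2$Fname, sep = "\r")

# Keep only the rows of comp1 whose key is absent from comp2
comp3 <- comp1[!(key1 %in% key2), ]
comp3
```

With the example data, comp3 holds the single row Nelson, Robert, 234. It keeps row name 2, because it is the second row of comp1. The "Tel" column plays no part in the comparison, which is why Bussieres/Paul is dropped even though its phone number differs between the two data frames.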
