
Deleting rows from a data frame in R conditional on "if any of (a specific variable) is equal to"

I have been struggling with this code for some time now. I have a vector of unique IDs, "EID", of length 821, extracted from one of my data frames (skate). It looks like this:

> head(skate$EID)
[1] "896-19" "895-8"  "899-1"  "899-5"  "899-8"  "895-7" 

I would like to remove the complete rows from another data frame (t5) wherever t5$EID equals (duplicates) a value in skate$EID.

I was able to extract the "duplicated" rows of t5, i.e. those whose EID matches one in skate, as follows:

> xx<-skate$EID
> t5[match(xx,t5[,26]), ]#gives me a dataframe of all matching EID in skate$EID
       record.t trip set month stratum NAFO unit.area time dur.set distance
8948          5  896  19    11     221   2J       N12  908      15        8
8849          5  895   8    10     766   3O       R36 1650      16        8
9289          5  899   1    12     743   3L       V26 2052      15        8
9299          5  899   5    12     746   3L       W27 1129      14        7

Here t5[,26] corresponds to the t5$EID column. I'm sure it's simple, but I'm not sure how to remove all of these rows from my t5 data frame. Tips would be very much appreciated. Thank you!

There are many ways to do this. To test for elements of vector A that are not in vector B, you can combine !, R's logical negation operator (see ?"!"), with %in% (see ?"%in%"). You then use the result of that test to indicate which rows to keep.

# Create two example data.frames
skate <- data.frame(EID = c("896-19", "895-8", "899-1", "899-5"), 
                    score = 1:4)
t5 <- data.frame(EID = c("896-19", "camel", "899-1", "goat", "899-1"), 
                 score = 105:101)

# Method 1
t5[!t5$EID %in% skate$EID, ] 

# Method 2 (using the very handy subset() function)
subset(t5, !EID %in% skate$EID)
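
To see what the test is doing, it can help to look at the intermediate logical vector, and to compare it with the match() call from the question. The following is only a sketch on the same toy data; the names in_skate, t5_clean, idx and t5_partial are made up for illustration and do not come from the original post.

```r
# Same toy data as in the answer above
skate <- data.frame(EID = c("896-19", "895-8", "899-1", "899-5"),
                    score = 1:4)
t5 <- data.frame(EID = c("896-19", "camel", "899-1", "goat", "899-1"),
                 score = 105:101)

# One TRUE/FALSE per row of t5: does this row's EID occur in skate$EID?
in_skate <- t5$EID %in% skate$EID
in_skate            # TRUE FALSE TRUE FALSE TRUE

# Negate the test to keep only the rows that are NOT in skate
t5_clean <- t5[!in_skate, ]
t5_clean            # the "camel" and "goat" rows

# match() answers a different question: for each skate$EID, the position
# of its FIRST occurrence in t5$EID, or NA when there is none
idx <- match(skate$EID, t5$EID)
idx                 # 1 NA 3 NA

# Dropping those positions misses the second "899-1" (row 5)
t5_partial <- t5[-idx[!is.na(idx)], ]
t5_partial
```

Because match() reports only the first hit per ID and returns NA for IDs with no hit, removing rows by those positions can leave repeated IDs behind, whereas %in% tests every row of t5.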
