Remove rows from a data.table that meet a condition

Question

I have a data table:

library(data.table)
DT <- data.table(col1=c("a", "b", "c", "c", "a"), col2=c("b", "a", "c", "a", "b"), condition=c(TRUE, FALSE, FALSE, TRUE, FALSE))

   col1 col2 condition
1:    a    b      TRUE
2:    b    a     FALSE
3:    c    c     FALSE
4:    c    a      TRUE
5:    a    b     FALSE

and would like to remove rows under the following conditions:

each row for which condition==TRUE (rows 1 and 4)
each row that has the same values for col1 and col2 as a row for which condition==TRUE (that is row 5, col1=a, col2=b)
finally, each row that has the same values as a row for which condition==TRUE, but with col1 and col2 switched (that is row 2, col1=b and col2=a)

So only row 3 should stay.

I'm doing this by making a new data table DTcond with all rows meeting the condition, looping over its values for col1 and col2, and collecting the indices of the rows in DT that should be removed.

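# Rows that meet the condition
DTcond <- DT[condition == TRUE, ]
indices <- c()
for (i in seq_len(nrow(DTcond))) {
    n1 <- DTcond[i, col1]
    n2 <- DTcond[i, col2]
    # Row numbers in DT that match this pair in either column order
    indices <- c(indices, DT[(col1 == n1 & col2 == n2) | (col1 == n2 & col2 == n1), which = TRUE])
}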

DT[!indices,]
   col1 col2 condition
1:    c    c     FALSE

This works but is terribly slow for large datasets, and I guess there must be other ways in data.table to do this without loops or apply. Any suggestions on how I could improve this (I'm new to data.table)?

Answer 1

You can do an anti join:

mDT = DT[(condition), !"condition"][, rbind(.SD, rev(.SD), use.names = FALSE)]
DT[!mDT, on=names(mDT)]

#    col1 col2 condition
# 1:    c    c     FALSE
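
To make the one-liner above easier to follow, here is a minimal step-by-step sketch of the same anti-join idea. The names flagged, swapped, pairs and result are only illustrative and are not part of the original answer, and the swapped copy is built with an explicit column selection rather than rev(.SD); the intent is the same.

```r
library(data.table)

DT <- data.table(
  col1 = c("a", "b", "c", "c", "a"),
  col2 = c("b", "a", "c", "a", "b"),
  condition = c(TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Step 1: keep only the rows where condition is TRUE, without the flag column
flagged <- DT[(condition), .(col1, col2)]

# Step 2: build the same pairs with col1 and col2 switched
swapped <- flagged[, .(col1 = col2, col2 = col1)]

# Step 3: stack both orientations into one lookup table of pairs to drop
pairs <- rbind(flagged, swapped)

# Step 4: anti join, i.e. keep the rows of DT with no match in pairs
result <- DT[!pairs, on = c("col1", "col2")]
print(result)
```

Because the matching is done by a single join rather than one scan of DT per flagged row, this should scale much better than the loop in the question, although the actual gain depends on the data.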

Answer 1: accepted, score 4, 2017-09-21 00:30:39