Removing rows from a data frame

Question

I have this data.frame:

set.seed(1)
df <- data.frame(id1 = LETTERS[sample(26, 100, replace = TRUE)], id2 = LETTERS[sample(26, 100, replace = TRUE)], stringsAsFactors = FALSE)

and this vector:

vec <- LETTERS[sample(26, 10, replace = FALSE)]

I want to remove from df any row in which either df$id1 or df$id2 is not in vec.

Is there a faster way to find the row indices that meet this condition than this:

rm.idx <- which(!apply(df,1,function(x) all(x %in% vec)))

Answer 1

I use dplyr with a script like this:

library(dplyr)
df1 <- df %>% filter(id1 %in% vec, id2 %in% vec)  # keep only rows where both ids are in vec

Answer 2

Looping over the columns might be faster than looping over the rows. So, use lapply to loop over the columns, create a list of logical vectors with %in%, use Reduce with | to check whether there is any TRUE value for each corresponding row, and use the result to subset 'df':

df[Reduce(`|`, lapply(df, `%in%`, vec)),]

If both elements need to be in vec (which is what the question asks for), replace | with &:

df[Reduce(`&`, lapply(df, `%in%`, vec)),]
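To make the Reduce step concrete, here is a minimal sketch on a three-row toy data frame. The names toy, keep_set, masks, either and both are made up for illustration and are not part of the original answer.

```r
# Toy data (hypothetical values, not the df from the question)
toy <- data.frame(id1 = c("A", "B", "C"),
                  id2 = c("A", "Z", "Z"),
                  stringsAsFactors = FALSE)
keep_set <- c("A", "B")

# One logical vector per column: is each value in keep_set?
masks <- lapply(toy, `%in%`, keep_set)
# masks$id1 is TRUE TRUE FALSE; masks$id2 is TRUE FALSE FALSE

# Combine the per-column vectors element-wise, i.e. row by row
either <- Reduce(`|`, masks)  # TRUE TRUE FALSE
both   <- Reduce(`&`, masks)  # TRUE FALSE FALSE

toy[both, ]  # keeps only row 1, where both ids are in keep_set
```

Because lapply walks over the columns, the same pattern should carry over unchanged to a data frame with more than two id columns.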

Answer 3

Actually,

rm.idx <- unique(which(!(df$id1 %in% vec) | !(df$id2 %in% vec)))

is also fast.
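As a rough sketch of how these indices might then be used, the following rebuilds the data from the question, checks that the vectorised indices match the row-wise apply() version, and drops the rows. The names drop, slow.idx and df.clean are illustrative assumptions. One detail worth knowing: negative indexing with an empty index vector selects no rows at all, so subsetting with the logical mask is the safer choice.

```r
set.seed(1)
df <- data.frame(id1 = LETTERS[sample(26, 100, replace = TRUE)],
                 id2 = LETTERS[sample(26, 100, replace = TRUE)],
                 stringsAsFactors = FALSE)
vec <- LETTERS[sample(26, 10, replace = FALSE)]

# Logical mask of rows to drop: TRUE where either id is missing from vec
drop <- !(df$id1 %in% vec) | !(df$id2 %in% vec)
rm.idx <- which(drop)

# Same indices as the row-wise apply() version from the question
# (unname() because apply() may attach row names to its result)
slow.idx <- unname(which(!apply(df, 1, function(x) all(x %in% vec))))
stopifnot(identical(rm.idx, slow.idx))

# Subsetting with the mask is safe even when nothing needs dropping;
# df[-rm.idx, ] would return zero rows if rm.idx were empty
df.clean <- df[!drop, ]
```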


3 answers

Answer 1
2 2016-11-16 06:45:11

Answer 2
1 2016-11-16 06:18:03

Answer 3
1 2016-11-16 06:24:02
