Deleting rows of a data frame with a specific condition
I've seen similar questions here, but I couldn't find any help.
I have a df like this:
df <- data.frame(CSF1=c(-9,-9,-9,-9), CSF2=c(-9,-1,-9,-9),
D13S1=c(-9,-9,11,11), D13S2=c(-9,-9,11,12))
CSF1 CSF2 D13S1 D13S2
10398 -9 -9 -9 -9
10398 -9 -1 -9 -9
20177 -9 -9 11 11
20361 -9 -9 11 12
I want to delete every row in which all columns have the value -9 or -1, like the first 2 rows.
Thanks!
All I will add is that the which function doesn't appear to be necessary. Removing it yields the same result.
There is a secondary problem that you would have in situations with missing data. If you add an NA to the 3rd row (try it with df[3,4] <- NA), then the output of the above solution will omit the 3rd row as well, regardless of the other entries' values. I won't suggest alternatives, as this may not be a problem for your data set.
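To make the missing-data issue concrete, here is a minimal sketch. It assumes the earlier version of the solution wrapped the comparison in which and did not pass na.rm = TRUE; that earlier version is not shown here, so treat the exact expression as an assumption.

```r
# Sample data from the question
df <- data.frame(CSF1 = c(-9, -9, -9, -9), CSF2 = c(-9, -1, -9, -9),
                 D13S1 = c(-9, -9, 11, 11), D13S2 = c(-9, -9, 11, 12))

# Introduce a missing value in the 3rd row
df[3, 4] <- NA

# Without na.rm, a single NA makes the whole row sum NA
counts <- rowSums(df == -1 | df == -9)

# Assumed earlier form of the filter: which() silently drops the NA comparison,
# so row 3 disappears even though it still holds a real value (11)
dropped <- df[which(counts != ncol(df)), ]
print(dropped)
```

Only the 4th row survives here, although the 3rd row is not made up entirely of -9 or -1.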
Try this (edited by Arun to account for Dov's post):
df[rowSums(df == -1 | df == -9, na.rm = TRUE) != ncol(df), ]
## CSF1 CSF2 D13S1 D13S2
## 3 -9 -9 11 11
## 4 -9 -9 11 12
(df == -1 | df == -9) will give you a logical matrix. rowSums will give you the count of TRUE in each row, since TRUE is evaluated as 1. The na.rm = TRUE is to ensure that rows with NA are not omitted (see Dov's post). Comparing each count with ncol(df) gives a logical vector that is TRUE for rows where not every value is -1 or -9; use it to subset df.
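The same one-liner can be unpacked step by step. This is only a sketch for illustration; the intermediate names (is_code, n_codes, keep, df_na) are made up here and are not part of the original answer.

```r
# Sample data from the question
df <- data.frame(CSF1 = c(-9, -9, -9, -9), CSF2 = c(-9, -1, -9, -9),
                 D13S1 = c(-9, -9, 11, 11), D13S2 = c(-9, -9, 11, 12))

# Step 1: logical matrix, TRUE wherever a cell is -1 or -9
is_code <- df == -1 | df == -9

# Step 2: count the TRUE cells per row (NA cells are skipped)
n_codes <- rowSums(is_code, na.rm = TRUE)   # 4 4 2 2

# Step 3: keep rows where at least one cell is something else
keep <- n_codes != ncol(df)                 # FALSE FALSE TRUE TRUE
result <- df[keep, ]
print(result)

# With an NA in row 3, na.rm = TRUE still keeps that row
df_na <- df
df_na[3, 4] <- NA
result_na <- df_na[rowSums(df_na == -1 | df_na == -9, na.rm = TRUE) != ncol(df_na), ]
print(result_na)
```

Note that with na.rm = TRUE an NA cell is simply not counted, so a row containing an NA can never reach ncol(df) and is always kept. Whether that is the desired behaviour depends on the data set.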