New subset by selecting rows based on values of a vector in R

I have a data set U1 over which I run a classifier and get a vector of labels:

pred.U1.nb.c <- predict(NB.C, U1[,2:6])
table(pred.U1.nb.c)
pred.U1.nb.c
    S unlabeled 
  148      5852 
> head(pred.U1.nb.c)
  [1] S S S S S S
  Levels: S unlabeled

Now I want to pull out the rows of U1 that were classified as S into a new subset, U1.S. What is the most efficient way to do this?

The answer by James has elegant economy going for it and would certainly work correctly with this example, but it is prone to undesirable results if the tested vector has any NAs: logical indexing with "[" returns an all-NA row for every NA in the index. (I have been bitten many times and been puzzled.) Here are two safer ways that avoid the NA-inclusive behavior of the "[" function:

U1[which(pred.U1.nb.c=="S"), ]

This converts the logical vector (possibly with NAs) into a numeric vector of row positions with no NAs, because which() returns only the indices where the test is TRUE. You can also use subset:

subset(U1, pred.U1.nb.c == "S")
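
To see why the NA handling matters, here is a minimal sketch on a toy data frame. The names `df` and `lab` are made up for illustration; they stand in for U1 and pred.U1.nb.c and are not from the original question.

```r
# Hypothetical stand-ins for U1 and pred.U1.nb.c
df  <- data.frame(id = 1:4, x = c(10, 20, 30, 40))
lab <- factor(c("S", NA, "unlabeled", "S"))

# Plain logical indexing: the NA in the index yields an all-NA row
by_logical <- df[lab == "S", ]

# which() keeps only the positions where the test is TRUE
by_which <- df[which(lab == "S"), ]

# subset() treats NA in the condition as FALSE
by_subset <- subset(df, lab == "S")

nrow(by_logical)  # 3: rows 1 and 4 plus one all-NA row
nrow(by_which)    # 2
nrow(by_subset)   # 2
```

If the label vector is known to contain no NAs, as in the table() output in the question, all three forms should return the same rows.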

EDIT: I suspect that using grepl would also avoid the NA concern. Perhaps:

U1[grepl("^S$", pred.U1.nb.c), ]
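
A quick check of the grepl idea, again on a made-up label vector (`lab` is hypothetical, not from the question). As far as I can tell, grepl reports FALSE rather than NA for missing elements, which is what makes it usable as a row index here:

```r
# Hypothetical labels with one missing value
lab <- factor(c("S", NA, "unlabeled", "S"))

eq_test   <- lab == "S"          # comparison propagates the NA
grep_test <- grepl("^S$", lab)   # pattern match gives FALSE for the NA

eq_test    # TRUE    NA FALSE  TRUE
grep_test  # TRUE FALSE FALSE  TRUE
```

The anchors in "^S$" matter: without them, any label that merely contains an "S" would match as well.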

U1[pred.U1.nb.c == "S", ]
