
Include all rows where a value occurs, otherwise remove

I have a dataset, df:

Subject

Hi
hello
RE: Hello
RE: How is work
No
Ok
RE: What time are 
Hello RE: are you
FW: hello

I would like to keep only the rows where the first word is RE: or FW:, and exclude all others:

Subject


RE: Hello
RE: How is work
RE: What time are 
FW: hello

Here is the dput:

structure(list(Subject = structure(c(2L, 1L, 5L, 6L, 3L, 4L, 7L),
  .Label = c("hello", "HI", "No", "ok", "RE: Hello", "RE:   How     is work",
             "RE: What time are", "FW: hello"), class = "factor")),
  class = "data.frame", row.names = c(NA, -7L))

I am thinking of using grepl, but I am not sure how to formulate the pattern. This is my attempt:

subset(df, grepl('^RE', 'FW', Subject)) 

You could combine the two patterns into one with | (regex alternation):

subset(df, grepl('^(RE|FW)', Subject))
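To illustrate why this works where the original attempt did not: grepl takes a single pattern as its first argument and the text to search as its second, so passing 'FW' in second position makes it the text rather than another pattern. Below is a minimal, self-contained sketch. The sample data is a hypothetical character-column version of the Subject list from the question (not the dput), and the trailing colon in the pattern is an added assumption that makes the match slightly stricter than the answer's '^(RE|FW)'.

```r
# Hypothetical sample data modelled on the question's Subject list
# (plain character column rather than the factor from the dput).
df <- data.frame(
  Subject = c("Hi", "hello", "RE: Hello", "RE: How is work", "No", "Ok",
              "RE: What time are", "Hello RE: are you", "FW: hello"),
  stringsAsFactors = FALSE
)

# ^        anchors the match at the start of the string
# (RE|FW)  matches either prefix
# :        (assumption) also requires the colon, so "REPLY" would not match
keep <- grepl("^(RE|FW):", df$Subject)

# drop = FALSE keeps the result a data frame even with a single column
result <- df[keep, , drop = FALSE]
print(result)
```

Because of the ^ anchor, "Hello RE: are you" is dropped even though it contains "RE:".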

Or using grep, which returns the matching row indices:

df[grep('^(RE|FW)', df$Subject), , drop = FALSE]

With tidyverse, we can do:

library(dplyr)
library(stringr)
df %>%
   filter(str_detect(Subject, '^(RE|FW)'))
# Output for the dput data above, which has no row using the "FW: hello" level:
#               Subject
#1             RE: Hello
#2 RE:   How     is work
#3     RE: What time are

Or in base R with startsWith, which needs no regular expression:

subset(df, startsWith(as.character(Subject), 
           "RE")|startsWith(as.character(Subject), "FW"))
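If the list of prefixes grows, repeating startsWith for each one gets unwieldy. One possible generalisation, sketched here under the assumption that the prefixes live in a character vector (the names prefixes, subjects and keep are illustrative, and the sample data is again a hypothetical version of the question's Subject list), is to build one logical vector per prefix and combine them with Reduce:

```r
# Hypothetical sample data modelled on the question's Subject list
df <- data.frame(
  Subject = c("Hi", "hello", "RE: Hello", "RE: How is work", "No", "Ok",
              "RE: What time are", "Hello RE: are you", "FW: hello"),
  stringsAsFactors = FALSE
)

prefixes <- c("RE:", "FW:")            # illustrative prefix list
subjects <- as.character(df$Subject)   # startsWith needs character input

# One logical vector per prefix, then an element-wise OR across all of them
keep <- Reduce(`|`, lapply(prefixes, function(p) startsWith(subjects, p)))

result <- df[keep, , drop = FALSE]
print(result)
```

The as.character call matters for the dput in the question, where Subject is a factor.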

