Include all rows where a value occurs and remove all others
I have a dataset, df:
Subject
Hi
hello
RE: Hello
RE: How is work
No
Ok
RE: What time are
Hello RE: are you
FW: hello
I would like to include all rows where the first word is RE: or FW:, excluding all others:
Subject
RE: Hello
RE: How is work
RE: What time are
FW: hello
Here is the dput:
structure(list(Subject = structure(c(2L, 1L, 5L, 6L, 3L, 4L,
7L), .Label = c("hello", "HI", "No", "ok", "RE: Hello", "RE: How is work",
"RE: What time are", "FW: hello"), class = "factor")), class = "data.frame", row.names = c(NA,
-7L))
I am thinking of using grepl, but I am not sure how to formulate this.
subset(df, grepl('^RE', 'FW', Subject))
grepl takes a single pattern as its first argument, so you could combine the two prefixes into one pattern with |:
subset(df, grepl('^(RE|FW)', Subject))
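To make the pattern easier to follow, here is a minimal self-contained sketch. The data frame below is an assumption that mirrors the subjects listed in the question, and the stricter variant with a trailing colon is an optional illustration, not part of the original answer.

```r
# Hypothetical sample data mirroring the subjects shown in the question.
df <- data.frame(
  Subject = c("Hi", "hello", "RE: Hello", "RE: How is work", "No", "Ok",
              "RE: What time are", "Hello RE: are you", "FW: hello"),
  stringsAsFactors = FALSE
)

# '^' anchors the match to the start of the string and (RE|FW) accepts
# either prefix, so "Hello RE: are you" is not matched.
keep <- grepl("^(RE|FW)", df$Subject)
result <- df[keep, , drop = FALSE]
print(result)

# Assumption: if only prefixes followed by a colon should count,
# the colon can be made part of the pattern.
strict <- df[grepl("^(RE|FW):", df$Subject), , drop = FALSE]
```

Without the ^ anchor, the row "Hello RE: are you" would also be kept, because the pattern could then match anywhere in the string.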
Or using grep:
df[grep('^(RE|FW)', df$Subject), , drop = FALSE]
With tidyverse, we can do:
library(dplyr)
library(stringr)
df %>%
  filter(str_detect(Subject, '^(RE|FW)'))
# Subject
#1 RE: Hello
#2 RE: How is work
#3 RE: What time are
Or in base R:
subset(df, startsWith(as.character(Subject), "RE") |
         startsWith(as.character(Subject), "FW"))
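If more prefixes are needed, the startsWith approach can be generalised rather than chaining | by hand. The following is only a sketch: the prefixes vector and the sample data are assumptions introduced for illustration.

```r
# Hypothetical sample data; Subject is a factor, as in the question's dput.
df <- data.frame(
  Subject = factor(c("Hi", "hello", "RE: Hello", "RE: How is work", "No",
                     "Ok", "RE: What time are", "Hello RE: are you",
                     "FW: hello"))
)

# Assumed list of prefixes to keep.
prefixes <- c("RE", "FW")

# startsWith() needs character input, so convert the factor first.
subj <- as.character(df$Subject)

# One logical vector per prefix, combined element-wise with OR.
keep <- Reduce(`|`, lapply(prefixes, function(p) startsWith(subj, p)))

result <- df[keep, , drop = FALSE]
print(result)
```

startsWith does a literal comparison, so no regular-expression escaping is needed for the prefixes.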