Delete rows containing specific strings in R
I would like to exclude rows containing the string "REVERSE". My values do not match the word exactly; they only contain it.
My input data frame:
Value Name
55 REVERSE223
22 GENJJS
33 REVERSE456
44 GENJKI
My expected output:
Value Name
22 GENJJS
44 GENJKI
This should do the trick:
df[- grep("REVERSE", df$Name),]
Or a safer version, which does not return an empty data frame when no row matches:
df[!grepl("REVERSE", df$Name),]
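To see why the second form is safer, here is a minimal sketch. The data frame is rebuilt from the question's table, and "NOMATCH" is just a made-up pattern that matches no row.

```r
# Data frame reconstructed from the question (assumed column types)
df <- data.frame(
  Value = c(55, 22, 33, 44),
  Name  = c("REVERSE223", "GENJJS", "REVERSE456", "GENJKI"),
  stringsAsFactors = FALSE
)

# A pattern that matches nothing makes grep() return integer(0)
no_hit <- grep("NOMATCH", df$Name)

# Negating an empty index vector still gives an empty index: zero rows survive
rows_with_minus <- nrow(df[-no_hit, ])

# A negated logical vector is all TRUE here, so every row survives
rows_with_grepl <- nrow(df[!grepl("NOMATCH", df$Name), ])

print(c(minus = rows_with_minus, grepl = rows_with_grepl))
```

With a pattern that does match, such as "REVERSE", both forms return the same two rows.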
You could use dplyr::filter() and negate a grepl() match:
library(dplyr)
df %>%
  filter(!grepl('REVERSE', Name))
Or with dplyr::filter() and negating a stringr::str_detect() match:
library(dplyr)    # provides filter() and the %>% pipe
library(stringr)
df %>%
  filter(!str_detect(Name, 'REVERSE'))
Actually I would use:
df[ grep("REVERSE", df$Name, invert = TRUE) , ]
This avoids deleting all of the records if the search word is not contained in any of the rows.
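As a rough illustration of what invert = TRUE returns, the sketch below rebuilds the question's data frame and uses "NOMATCH" as a made-up pattern that matches no row.

```r
# Data frame reconstructed from the question (assumed column types)
df <- data.frame(
  Value = c(55, 22, 33, 44),
  Name  = c("REVERSE223", "GENJJS", "REVERSE456", "GENJKI"),
  stringsAsFactors = FALSE
)

# invert = TRUE returns the positions that do NOT match the pattern
keep_some <- grep("REVERSE", df$Name, invert = TRUE)

# When nothing matches, every position is returned, so no row is lost
keep_all <- grep("NOMATCH", df$Name, invert = TRUE)

print(keep_some)
print(keep_all)
print(df[keep_some, ])
```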
You can use the stri_detect_fixed function from the stringi package:
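library(stringi)
stri_detect_fixed(c("REVERSE223", "GENJJS"), "REVERSE")
[1]  TRUE FALSE
df[!stri_detect_fixed(df$Name, "REVERSE"), ]  # negate to drop the matching rows from df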
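The "fixed" in stri_detect_fixed means the pattern is matched literally rather than as a regular expression. Base R's grepl() offers similar behaviour through fixed = TRUE. A small sketch with made-up names shows where the difference matters:

```r
# Hypothetical names; the dot is a regex wildcard unless matching is literal
x <- c("A.B_1", "AXB_2")

# Regex matching: "." matches any character, so both elements match
regex_hits <- grepl("A.B", x)

# Literal matching: only the element containing the exact text "A.B" matches
fixed_hits <- grepl("A.B", x, fixed = TRUE)

print(regex_hits)
print(fixed_hits)
```

For a plain word such as "REVERSE" both modes give the same result, so this only matters when the search string contains characters like ".", "(" or "+".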
If there are multiple strings, you can combine them with | in the pattern:
df[!grepl("REVERSE|GENJJS", df$Name),]
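When the strings to exclude live in a vector, the alternation pattern can be assembled with paste(). This is only a sketch: drop_terms is a hypothetical name, and the data frame is rebuilt from the question.

```r
# Data frame reconstructed from the question (assumed column types)
df <- data.frame(
  Value = c(55, 22, 33, 44),
  Name  = c("REVERSE223", "GENJJS", "REVERSE456", "GENJKI"),
  stringsAsFactors = FALSE
)

# Hypothetical vector of substrings to exclude
drop_terms <- c("REVERSE", "GENJJS")

# Join the terms with "|" to build the regex "REVERSE|GENJJS"
pattern <- paste(drop_terms, collapse = "|")

# Keep only the rows whose Name matches none of the terms
result <- df[!grepl(pattern, df$Name), ]
print(result)
```

This assumes the terms contain no regex metacharacters; if they might, they would need escaping first.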
You can apply this to the same data frame (df) using the previously provided code:
df[!grepl("REVERSE", df$Name),]
Or you can assign the result to a different name:
df1<-df[!grepl("REVERSE", df$Name),]
A late answer building on BobD59's and hidden-layer's responses.
This removes rows containing any of several specific strings, while avoiding deleting all of the records if none of the search words is contained in any of the rows.
# invert is an argument of grep(), not grepl(); negating with ! already keeps all rows when nothing matches
df1 <- df[!grepl("REVERSE|GENJJS", df$Name), ]