
Removing multiple words from a string using a vector instead of regexp in R

I would like to remove multiple words from a string in R, but I would like to use a character vector instead of a regexp.

For example, if I had the string

"hello how are you" 

and wanted to remove

c("hello", "how")

I would like to get back

" are you"

I can get close with str_remove() from stringr:

"hello how are you" %>% str_remove(c("hello","how"))
[1] " how are you"   "hello  are you"

But this returns one result per word, so I would still need to combine them into a single string. Is there a function that does all of this in one call?

We can collapse the words with | so that the pattern is evaluated as a regex OR:

library(stringr)
library(magrittr)
pat <- str_c(words, collapse="|")
"hello how are you" %>% 
      str_remove_all(pat) %>%
      trimws
#[1] "are you"

Data

words <- c("hello", "how")
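One caveat with a plain alternation is that it also matches inside longer words. The sketch below illustrates this and one possible workaround using word boundaries. It is written in base R so it runs without extra packages, and the input string with "however" is an illustrative assumption, not part of the original question.

```r
words <- c("hello", "how")
x <- "hello how are you, however"   # illustrative input, not from the question

# Plain alternation: "how" is also removed from inside "however"
pat_plain <- paste(words, collapse = "|")
plain <- trimws(gsub(pat_plain, "", x))

# Wrapping the alternation in \\b ... \\b restricts matches to whole words
pat_whole <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
whole <- trimws(gsub(pat_whole, "", x, perl = TRUE))

print(plain)
print(whole)
```

The same whole-word pattern should also work when passed to str_remove_all(), since stringr's regex engine supports \\b as well.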

A base R possibility could be:

x <- "hello how are you"   
trimws(gsub("hello|how", "", x))

[1] "are you"

Or, if you have more words, a clever idea proposed by @Wimpel is to build the pattern from a vector:

# Collapse the vector of words into a single "hello|how" pattern
words <- paste(c("hello", "how"), collapse = "|")
trimws(gsub(words, "", x))
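If the goal is to avoid regular expressions altogether, one option is to remove each word in turn as a literal string. The following is a minimal sketch under that assumption; `remove_words` is a hypothetical helper name, not a function from base R or stringr, and the second call uses made-up input to show that regex metacharacters are treated literally.

```r
# Hypothetical helper: removes every element of `words` from `x`
# as a literal string (fixed = TRUE turns off regex interpretation).
remove_words <- function(x, words) {
  # Left fold: start from x and strip one word per step
  out <- Reduce(function(s, w) gsub(w, "", s, fixed = TRUE), words, x)
  trimws(out)
}

res1 <- remove_words("hello how are you", c("hello", "how"))
res2 <- remove_words("cost is $5 (approx.)", c("$5", "(approx.)"))

print(res1)
print(res2)
```

Because the matching is literal, this still removes substrings inside longer words (for example "how" in "however"), so it fits best when the words are known not to occur as parts of other words.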
