Removing multiple words from a string using a vector instead of regexp in R

I would like to remove multiple words from a string in R, but would like to use a character vector instead of a regexp.

For example, if I had the string

"hello how are you" 

and wanted to remove

c("hello", "how")

I would return

" are you"

I can get close with str_remove() from stringr, but it returns one result per word:

"hello how are you" %>% str_remove(c("hello","how"))
[1] " how are you"   "hello  are you"

But I'd need to do something more to get this down to a single string. Is there a function that does all of this in one call?

We can collapse the vector into a single pattern separated by |, which the regex engine treats as OR:

library(stringr)
library(magrittr)

words <- c("hello", "how")

# Collapse the vector into one alternation pattern: "hello|how"
pat <- str_c(words, collapse = "|")

"hello how are you" %>%
      str_remove_all(pat) %>%
      trimws()
#[1] "are you"
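One caveat with a pattern built this way: it also matches inside longer words, so "how" would be stripped out of "however" as well. A minimal base R sketch, assuming you only want whole words removed, wraps the alternation in \\b word boundaries. The example sentence and the names plain, pat_whole and whole are mine for illustration, not from the question.

```r
words <- c("hello", "how")
x <- "hello how are you, however"

# Plain alternation also hits the "how" inside "however"
plain <- gsub(paste(words, collapse = "|"), "", x)

# Wrap the alternation in word boundaries so only whole words match
pat_whole <- paste0("\\b(?:", paste(words, collapse = "|"), ")\\b")
whole <- trimws(gsub(pat_whole, "", x, perl = TRUE))
```

Here plain ends in ", ever" because the substring was removed, while whole keeps "however" intact.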

A base R possibility could be:

x <- "hello how are you"
# Replace every match with the empty string
trimws(gsub("hello|how", "", x))

[1] "are you"

Or, if you have more words, build the pattern from a vector, as proposed by @Wimpel:

words <- paste(c("hello", "how"), collapse = "|")
trimws(gsub(words, "", x))
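Since the question asks for a vector rather than a regexp, here is a sketch that avoids regex interpretation altogether. It loops over the vector with Reduce() and removes each element as a literal string via fixed = TRUE, which matters when a word contains characters such as "." or "(". The helper name remove_words is hypothetical, not an existing function.

```r
# Hypothetical helper: strip each element of `words` from `x` as a literal string
remove_words <- function(x, words) {
  # Reduce passes the running result `s` and the next word `w` to gsub;
  # fixed = TRUE switches off regex interpretation of `w`
  Reduce(function(s, w) gsub(w, "", s, fixed = TRUE), words, x)
}

trimws(remove_words("hello how are you", c("hello", "how")))
#[1] "are you"

# The "." is matched literally, so "axb" is left alone
remove_words("a.b axb", "a.b")
#[1] " axb"
```

Like the regex versions, this removes substrings rather than whole words, so it is only a fit when that is acceptable.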
