R - Find if characters of a vector are in another vector

I have a question very similar to this one: Find matches of a vector of strings in another vector of strings.

I have a vector of client names, and if a name indicates a commercial client, I need to change that client's type in my data.frame.

So, suppose that:

commercial_names <- c("BAKERY","MARKET", "SCHOOL", "CINEMA")
clients <- c("JOHN XX","REESE YY","BAKERY ZZ","SAMANTHA WW")

I tried the code from the question cited above, but it returned TRUE for every client, along with a warning:

> grepl(paste(commercial_names, collape="|"),clients)
[1] TRUE TRUE TRUE TRUE
Warning message:
In grepl(paste(commercial_names, collape = "|"), clients) :
  argument 'pattern' has length > 1 and only the first element will be used

What am I doing wrong? I would appreciate any help.

Your code is correct except for a typo: the argument is spelled collapse, not collape.

grepl(paste0(commercial_names, collapse = "|"), clients) # collapse spelled correctly
[1] FALSE FALSE  TRUE FALSE

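Because of the typo, paste() treats collape = "|" as one more string to paste rather than as the collapse argument, so commercial_names is not collapsed into a single pattern. The result is a vector of four patterns, and grepl() uses only the first one, "BAKERY |". The empty alternative after the | matches any string, which is why every client came back TRUE.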
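To see the effect of the typo directly, it can help to print what paste() returns in each case. This is a minimal sketch using only the vectors from the question; the variable names bad_pattern, good_pattern and is_commercial are introduced here for illustration.

```r
commercial_names <- c("BAKERY", "MARKET", "SCHOOL", "CINEMA")
clients <- c("JOHN XX", "REESE YY", "BAKERY ZZ", "SAMANTHA WW")

# With the typo, `collape` is just another string to paste (default sep = " "),
# so the result is still a vector of four strings.
bad_pattern <- paste(commercial_names, collape = "|")
print(bad_pattern)

# With the correct argument name, the names are joined into one regex
# of alternatives.
good_pattern <- paste(commercial_names, collapse = "|")
print(good_pattern)

# A single pattern means grepl() tests every client against all names.
is_commercial <- grepl(good_pattern, clients)
print(is_commercial)
```

Checking length(bad_pattern) before calling grepl() is a quick way to catch this kind of mistake, since a pattern should have length 1.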

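I'm not sure how to do this with a one-liner, but looping over the clients with sapply() and str_detect() from the stringr package seems to do the trick.

library(stringr)  # provides str_detect()

# For each client, check whether any commercial name occurs in it
sapply(clients, function(client) {
  any(str_detect(client, commercial_names))
})
    JOHN XX    REESE YY   BAKERY ZZ SAMANTHA WW 
      FALSE       FALSE        TRUE       FALSE 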
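The same per-name check can also be written in base R without stringr. The sketch below is one possible variant, not the approach from the answer above: it runs grepl() once per commercial name with fixed = TRUE, so the names are treated as literal text rather than regular expressions, and then combines the results with a logical OR. The names hits_per_name and is_commercial are introduced here for illustration.

```r
commercial_names <- c("BAKERY", "MARKET", "SCHOOL", "CINEMA")
clients <- c("JOHN XX", "REESE YY", "BAKERY ZZ", "SAMANTHA WW")

# One logical vector per commercial name: does that name occur in each client?
# fixed = TRUE matches the name literally, so regex metacharacters are harmless.
hits_per_name <- lapply(commercial_names, grepl, x = clients, fixed = TRUE)

# A client is commercial if any of the names matched it.
is_commercial <- Reduce(`|`, hits_per_name)
print(is_commercial)
```

Literal matching may be preferable if the real list of commercial names could contain characters such as "." or "(", which would otherwise be interpreted as regex syntax.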

I found another way to do this, with the %like% operator from the data.table package:

> library(data.table)
> clients %like% paste(commercial_names,collapse = "|")
[1] FALSE FALSE  TRUE FALSE

You can do something like this too:

clients.first <- gsub(" ..", "", clients)
clients.first %in% commercial_names

This returns:

[1] FALSE FALSE  TRUE FALSE

Note that the pattern " .." only strips a space followed by exactly two characters, so you will need to change the regular expression for gsub if your clients data is more heterogeneous.
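For more heterogeneous names, one option is to keep everything before the first space and then write the result back into the data.frame, as the question asks. The following is a sketch under stated assumptions: the data.frame df, its columns client and type, the labels "RESIDENTIAL" and "COMMERCIAL", and the extra client "CINEMA PARADISO LTD" are all hypothetical and do not come from the original question.

```r
commercial_names <- c("BAKERY", "MARKET", "SCHOOL", "CINEMA")

# Hypothetical data: column names, type labels and the last client are assumptions.
df <- data.frame(
  client = c("JOHN XX", "REESE YY", "BAKERY ZZ", "SAMANTHA WW",
             "CINEMA PARADISO LTD"),
  type = "RESIDENTIAL",
  stringsAsFactors = FALSE
)

# Drop the first space and everything after it, leaving only the first word.
first_word <- sub(" .*$", "", df$client)

# Exact match of the first word against the commercial names.
is_commercial <- first_word %in% commercial_names

# Update the type only for the rows flagged as commercial.
df$type[is_commercial] <- "COMMERCIAL"
print(df)
```

This variant only flags clients whose name starts with a commercial word; if the word can appear anywhere in the name, the grepl() approach with a collapsed pattern is the better fit.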
