
R grepl in dataframe

I am trying to check, row by row, whether the string in one column appears in another column. I tried grepl:

grepl("b", "d,b,c", fixed = TRUE)
> TRUE

which works fine on "standalone" objects, but in a dataframe:

 library(dplyr)

 df = data.frame(id = c("a","b"), ids = c("b,c", "d,b,c")) %>%
     mutate(match = grepl(id, .$ids, fixed = TRUE), truematch = c(FALSE, TRUE))

> df
  id   ids match truematch
1  a   b,c FALSE     FALSE
2  b d,b,c FALSE      TRUE

This does not give what I expected: I want to produce the column truematch, but I can only get match.

Since grepl is not vectorised over its pattern argument (only the first element of pattern is used), we can use rowwise to apply it to each row:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(truematch = grepl(id, ids, fixed = TRUE))

#  id    ids   match truematch
#  <fct> <fct> <lgl> <lgl>    
#1 a     b,c   FALSE FALSE    
#2 b     d,b,c FALSE TRUE     
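
To see why the original attempt returned FALSE in both rows, here is a minimal base R sketch. The vectors id and ids are stand-ins for the question's columns, and the warning is suppressed only to keep the demo quiet.

```r
# Toy vectors mirroring the question's two columns
id  <- c("a", "b")
ids <- c("b,c", "d,b,c")

# grepl() is vectorised over `x` only. A pattern of length > 1 is cut down
# to its first element (R normally warns about this).
res_vec   <- suppressWarnings(grepl(id, ids, fixed = TRUE))

# So the call above is equivalent to searching every row for "a" alone
res_first <- grepl(id[1], ids, fixed = TRUE)

print(res_vec)
print(identical(res_vec, res_first))
```

Both rows are searched for "a", which is why match is FALSE even in the row where id is "b".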

However, rowwise was considered somewhat outdated at the time of writing. Alternatively, we can use purrr::map2_lgl, which calls grepl once per pair of elements:

df %>% mutate(truematch = purrr::map2_lgl(id, ids, grepl, fixed = TRUE))

However, for this case a better option is stringr::str_detect, which is vectorised over both string and pattern:

# fixed() is exported by stringr, so qualify it unless stringr is attached
df %>% mutate(truematch = stringr::str_detect(ids, stringr::fixed(id)))
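
One caveat with any substring test: if ids is meant as a comma-separated list, "b" would also be found inside an element such as "ab". Assuming whole-element matching is what is wanted (the question does not say so explicitly), a base R sketch could split the list first. The data below is made up to show the difference.

```r
# Hypothetical data where a substring test and an element test disagree
id  <- c("b", "b")
ids <- c("ab,c", "d,b,c")

# Substring test: "b" is found inside "ab", so both rows match
substring_match <- unname(mapply(grepl, id, ids, MoreArgs = list(fixed = TRUE)))

# Element test: split each list on commas, then compare whole elements
parts <- strsplit(ids, ",", fixed = TRUE)
element_match <- mapply(function(needle, haystack) needle %in% haystack,
                        id, parts, USE.NAMES = FALSE)

print(substring_match)
print(element_match)
```

If partial matches are acceptable, the grepl and str_detect approaches are enough as they are.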

Another option is to use sapply to call grepl once per row:

 df %>%  mutate(match = sapply(seq_len(nrow(.)), function(x) grepl(.$id[x], .$ids[x], fixed = TRUE)))

which gives:

  id   ids  match
1  a   b,c FALSE
2  b d,b,c  TRUE
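
The same row-by-row idea can be written without indexing by position. The sketch below uses base R's mapply on plain vectors (stand-ins for the data frame columns), which walks both inputs in parallel.

```r
id  <- c("a", "b")
ids <- c("b,c", "d,b,c")

# mapply() calls grepl(pattern = id[i], x = ids[i]) for each i;
# MoreArgs passes fixed = TRUE unchanged to every call.
# unname() drops the names that mapply() takes from the character input.
truematch <- unname(mapply(grepl, id, ids, MoreArgs = list(fixed = TRUE)))

print(truematch)
```

Inside a pipeline this could be used as mutate(truematch = mapply(grepl, id, ids, MoreArgs = list(fixed = TRUE))).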
