
R, dplyr: how to transform values in one column based on the value in another column using gsub

I have a data frame with two (relevant) factors, and I'd like to remove from each value of one factor the substring equal to the corresponding value of the other factor, or leave the value alone if there is no such substring. Can I do this using dplyr?

To make an MWE, suppose these factors are x and y.

library(dplyr)
df <- data.frame(x = c(rep('abc', 3)), y = c('a', 'b', 'd'))

df:

      x y
1   abc a
2   abc b
3   abc d

What I want:

      x y
1    bc a
2    ac b
3   abc d

My attempt was:

df |> transform(x = gsub(y, '', x))

However, this produces the following, incorrect result, plus a warning message:

    x y
1  bc a
2  bc b
3  bc d

 Warning message:
 In gsub(y, "", x) :
    argument 'pattern' has length > 1 and only the first element will be used

How can I do this?

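Unlike gsub, which uses only the first element of its pattern argument, str_remove from stringr is vectorized over both the string and the pattern, so each value of y is removed from the x in the same row. Note that str_remove removes only the first match (like sub); str_remove_all removes every match (like gsub).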

library(stringr)
library(dplyr)
df <- df %>% 
    mutate(x = str_remove(x, y))

Output:

df
    x y
1  bc a
2  ac b
3 abc d
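One caveat worth illustrating: str_remove treats the pattern as a regular expression, so a y value containing a metacharacter such as "." would match more than the literal character. The sketch below assumes a hypothetical variant of the example data (x = "a.c", with "." among the y values) and wraps the pattern in stringr's fixed() to force literal matching. The column names x_regex and x_fixed are made up for the comparison.

```r
library(stringr)
library(dplyr)

# Hypothetical data: the second pattern is the regex metacharacter "."
df2 <- data.frame(x = rep("a.c", 3), y = c("a", ".", "d"),
                  stringsAsFactors = FALSE)

res <- df2 %>%
  mutate(
    x_regex = str_remove(x, y),        # "." is a wildcard and removes the first character
    x_fixed = str_remove(x, fixed(y))  # "." is matched literally
  )

res
```

If the values of y in the real data are plain substrings rather than regular expressions, the fixed() form is probably the safer choice.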

If we want to keep using sub/gsub, we can process the data frame row by row with rowwise, so that y is a single pattern in each call:

df %>%
   rowwise() %>%
   mutate(x = sub(y, "", x)) %>%
   ungroup()
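For completeness, the same row-by-row idea can be sketched in base R without dplyr, using mapply to pair each pattern with its string. This is only an illustration, not part of the original answer; it assumes the patterns are literal substrings (hence fixed = TRUE) and that the columns are character vectors rather than factors.

```r
# Same example data as in the question, stored as character columns
df <- data.frame(x = rep("abc", 3), y = c("a", "b", "d"),
                 stringsAsFactors = FALSE)

# Call sub() once per row, pairing each pattern in y with the string in x
df$x <- mapply(
  function(pattern, text) sub(pattern, "", text, fixed = TRUE),
  df$y, df$x,
  USE.NAMES = FALSE
)

df
```

Replacing sub with gsub in the anonymous function would remove every occurrence of the pattern instead of only the first.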
