r, dplyr: how to transform values in one column based on value in another column using gsub
I have a data frame with two (relevant) factors, and I'd like to remove from the value of one factor any substring equal to the value of the other factor, or leave it alone if there is no such substring. Can I do this using dplyr?
To make an MWE, suppose these factors are x and y.
library(dplyr)
df <- data.frame(x = c(rep('abc', 3)), y = c('a', 'b', 'd'))
df:
x y
1 abc a
2 abc b
3 abc d
What I want:
x y
1 bc a
2 ac b
3 abc d
My attempt was:
df |> transform(x = gsub(y, '', x))
However, this produces the following incorrect result, plus a warning message:
x y
1 bc a
2 bc b
3 bc d
Warning message:
In gsub(y, "", x) :
argument 'pattern' has length > 1 and only the first element will be used
How can I do this?
str_remove is vectorized over the pattern, whereas gsub uses only the first element of a pattern vector:
library(stringr)
library(dplyr)
df <- df %>%
mutate(x = str_remove(x, y))
-output
df
x y
1 bc a
2 ac b
3 abc d
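One caveat worth sketching: str_remove drops only the first match and treats y as a regular expression, while gsub replaces every match. If the gsub-like "remove all occurrences" behavior is wanted, or if y may contain regex metacharacters, str_remove_all combined with fixed() is one option. The data frame df2 and the result column names below are made up purely for illustration.

```r
library(stringr)
library(dplyr)

# Hypothetical data: a repeated match, a regex metacharacter, and no match
df2 <- data.frame(x = c("abcabc", "a.c", "abc"),
                  y = c("a", ".", "d"))

res <- df2 %>%
  mutate(
    first_only  = str_remove(x, y),            # first match only, y read as a regex
    all_regex   = str_remove_all(x, y),        # every match, y read as a regex
    all_literal = str_remove_all(x, fixed(y))  # every match, y read as a literal string
  )

res
```

In the second row the regex "." matches any character, so only the fixed() variant leaves "ac".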
If we want to use sub/gsub, then we may need rowwise, so that each call receives a single pattern:
df %>%
  rowwise() %>%
  mutate(x = sub(y, "", x)) %>%
  ungroup()
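As a rough base R alternative (not part of the original answer), the same pairing of one pattern with one string can be sketched with mapply, which calls gsub once per row. The argument fixed = TRUE is an assumption that y should be matched literally rather than as a regular expression; drop it if regex matching is intended.

```r
# Rebuild the example data so the snippet runs on its own
df <- data.frame(x = rep("abc", 3), y = c("a", "b", "d"))

# Call gsub once per (pattern, string) pair; as.character guards against factor columns
df$x <- mapply(
  function(pattern, string) gsub(pattern, "", string, fixed = TRUE),
  as.character(df$y),
  as.character(df$x),
  USE.NAMES = FALSE
)

df
```

This avoids the rowwise grouping step, at the cost of stepping outside the dplyr pipeline.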