How to match multiple strings in one column against strings in another column and remove the matches in R?
Here is my code:
library(dplyr)
library(stringr)

A <- c("ruler measure", "measure rulers", "rulers")
B <- c("you can measure things with rulers", "you can measure things with rulers", "you can measure things with rulers")
# Name the columns explicitly so that they are called A and B
df <- data.frame(A = A, B = B, stringsAsFactors = FALSE)
df_new <- df %>%
  mutate(
    new_B = str_replace_all(B, A, "")
  )
What I want is for the columns to look like this:
A               B
ruler measure   you can things with
measure rulers  you can things with
rulers          you can measure things with
However, str_replace_all() seems to replace only one of the matches between A and B (e.g., rulers) and not the other (e.g., measure).
Thank you for your help!!
We can replace the spaces with |, so that each word in A becomes an alternative in the regular expression:
library(dplyr)
library(stringr)
df %>%
  # str_replace_all on A handles entries with more than two words
  mutate(new_B = str_replace_all(B, str_replace_all(A, " ", "|"), ""))
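With a plain alternation, a word such as ruler matches only part of rulers, and removing words leaves extra spaces behind. The following is a minimal sketch of one way to tidy that up, not the answerer's code. It assumes the words in A contain no regex metacharacters and that prefix matching (ruler also removing rulers) is wanted, as the desired output in the question suggests. The helper name words_to_pattern is hypothetical.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  A = c("ruler measure", "measure rulers", "rulers"),
  B = rep("you can measure things with rulers", 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "ruler measure" becomes "\\b(ruler|measure)\\w*",
# i.e. any word that starts with one of the words in A.
# Assumption: the words contain no regex metacharacters.
words_to_pattern <- function(a) {
  paste0("\\b(", str_replace_all(str_squish(a), " ", "|"), ")\\w*")
}

df_new <- df %>%
  mutate(
    # Remove the matching words, then collapse the leftover whitespace
    new_B = str_squish(str_replace_all(B, words_to_pattern(A), ""))
  )

df_new$new_B
```

str_replace_all() is vectorised over both the string and the pattern, so each row of B is matched against the pattern built from the same row of A.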
Here is a base R solution:
df <- within(df,
  new_B <- mapply(gsub,
    sapply(strsplit(as.character(A), "\\s+"),
      function(v) paste0(paste0("\\s+?", v, ".*?\\b"), collapse = "|")),
    "",
    B))
which gives:
> df
               A                                  B                       new_B
1  ruler measure you can measure things with rulers         you can things with
2 measure rulers you can measure things with rulers         you can things with
3         rulers you can measure things with rulers you can measure things with
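For comparison, here is a base R sketch of the same idea written with explicit word boundaries and perl = TRUE. It is a variant under stated assumptions rather than the answer's own pattern: the words in A are assumed to contain no regex metacharacters, and any word in B that starts with a word from A is removed. The variable names patterns and removed are illustrative only.

```r
A <- c("ruler measure", "measure rulers", "rulers")
B <- rep("you can measure things with rulers", 3)
df <- data.frame(A, B, stringsAsFactors = FALSE)

# One pattern per row, e.g. "\\b(ruler|measure)\\w*"
# Assumption: the words contain no regex metacharacters.
patterns <- vapply(
  strsplit(df$A, "\\s+"),
  function(v) paste0("\\b(", paste(v, collapse = "|"), ")\\w*"),
  character(1)
)

# Apply each row's pattern to the same row of B
removed <- mapply(gsub, patterns, "", df$B,
                  MoreArgs = list(perl = TRUE), USE.NAMES = FALSE)

# Collapse repeated whitespace and trim the ends
df$new_B <- trimws(gsub("\\s+", " ", removed))

df$new_B
```

Setting USE.NAMES = FALSE keeps mapply() from labelling the result with the patterns themselves.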