
Compare strings in two columns in R

Suppose I've got this data:

 ColA               ColB             
------------       ------------------------
 apple tree         Mary has an apple tree
 orange+apple       Lucy loves orange+apple
 orange apple       Anne loves orange+apple

I want to check, row by row, whether ColB contains ColA, and store the result in a logical variable:

 ColA               ColB                        Ind
------------       ------------------------     -----
 apple tree         Mary has an apple tree      TRUE
 orange+apple       Lucy loves orange+apple     TRUE
 orange apple       Anne loves orange+apple     FALSE

Any suggestions using R?

Many thanks!

We can use str_detect, which is vectorized over both the string and the pattern. Wrapping ColA in fixed() makes each pattern match literally instead of being read as a regular expression.

library(dplyr)
library(stringr)
# Row-wise literal match: does ColB contain the text in ColA?
df1 <- df1 %>%
  mutate(Ind = str_detect(ColB, fixed(ColA)))

Output:

df1
#         ColA                    ColB   Ind
#1   apple tree  Mary has an apple tree  TRUE
#2 orange+apple Lucy loves orange+apple  TRUE
#3 orange apple Anne loves orange+apple FALSE

Data:

df1 <- structure(list(ColA = c("apple tree", "orange+apple", "orange apple"
), ColB = c("Mary has an apple tree", "Lucy loves orange+apple", 
"Anne loves orange+apple")), class = "data.frame", row.names = c(NA, 
-3L))
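
To illustrate why the literal match matters for this data, here is a minimal sketch using base R's grepl, whose fixed = TRUE argument plays the same role as stringr's fixed(). The variable names (text, pattern, as_regex, as_literal) are made up for the example. Read as a regular expression, the "+" in "orange+apple" means "one or more of the preceding e", so the pattern no longer matches the literal plus sign in the string.

```r
# "orange+apple" read as a regex means: "orang", one or more "e", then "apple"
text    <- "Lucy loves orange+apple"
pattern <- "orange+apple"

as_regex   <- grepl(pattern, text)                # "+" treated as a quantifier
as_literal <- grepl(pattern, text, fixed = TRUE)  # "+" treated as a plain character

print(c(regex = as_regex, literal = as_literal))
# regex is FALSE, literal is TRUE
```

Without the literal match, row 2 of the example data would come out FALSE rather than TRUE.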

Here is a base R option using Vectorize over grepl:

# Vectorize() lets grepl take one pattern per row; fixed = TRUE matches literally
within(
  df1,
  Ind <- Vectorize(grepl)(ColA, ColB, fixed = TRUE)
)

giving

          ColA                    ColB   Ind
1   apple tree  Mary has an apple tree  TRUE
2 orange+apple Lucy loves orange+apple  TRUE
3 orange apple Anne loves orange+apple FALSE
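
Vectorize is a thin wrapper around mapply, so the same row-wise check can be written with mapply directly. This is a sketch of an equivalent alternative, not part of the answer above; it rebuilds the example data so it runs on its own, and it assigns the result back to df1, which the within() call alone does not do.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  ColA = c("apple tree", "orange+apple", "orange apple"),
  ColB = c("Mary has an apple tree", "Lucy loves orange+apple",
           "Anne loves orange+apple"),
  stringsAsFactors = FALSE
)

# Call grepl(pattern, x, fixed = TRUE) once per row, pairing ColA[i] with ColB[i]
df1$Ind <- mapply(grepl, df1$ColA, df1$ColB,
                  MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

print(df1)
```

USE.NAMES = FALSE keeps the result a plain logical vector instead of one named after the patterns.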
