How to return a logical vector with a partial substring match in R

I would like to match one column against another by substring and return TRUE wherever there is a partial match in either direction.

A            B          C
hello      helloworld  true
worldhello hello       true
dog        hello       false

Above is a quick example of my two columns (A and B) and the logical vector I would like returned (C).

Calling your example df, this would do it:

sapply(1:nrow(df),function(i)with(df[i,],grepl(A,B)|grepl(B,A)))
# [1]  TRUE  TRUE FALSE

There's probably a more efficient way, though.
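One possible alternative, sketched here under assumptions rather than taken from the answer above: pair the two columns element by element with mapply() instead of subsetting the data frame row by row. The helper name partial_match is hypothetical, and fixed = TRUE is an added choice so that the strings are compared literally rather than as regular expressions.

```r
# Example data from the question
df <- data.frame(
  A = c("hello", "worldhello", "dog"),
  B = c("helloworld", "hello", "hello"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where either string contains the other.
# fixed = TRUE treats each pattern as a literal string, not a regex.
partial_match <- function(x, y) {
  x_in_y <- mapply(grepl, x, y, MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
  y_in_x <- mapply(grepl, y, x, MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
  x_in_y | y_in_x
}

df$C <- partial_match(df$A, df$B)
df$C
# [1]  TRUE  TRUE FALSE
```

Whether this is actually faster depends on the data; it mainly avoids the per-row data frame subsetting and keeps characters such as "." from being interpreted as regex syntax.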

An old question, but for the record: you can also use dplyr to achieve this.

Taking the matrix from @darwin and applying the solution from @jlhoward, it looks like this:

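library(dplyr)

# a is the character matrix defined in @darwin's answer; its columns become V1 and V2
as.data.frame(a) %>%
  rowwise() %>%
  mutate(V3 = grepl(V1, V2) | grepl(V2, V1))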

You need rowwise() because grepl() is not vectorized over its pattern argument: it expects a single pattern, so each row has to be evaluated on its own.
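A possible way to avoid the row-by-row evaluation, offered as a sketch and not as part of the original answer: stringr::str_detect() is vectorized over both the string and the pattern. This assumes the stringr package is installed; the column names A, B and C follow the question, and fixed() is an added choice that makes the match literal instead of a regex.

```r
library(stringr)

# Example data from the question
df <- data.frame(
  A = c("hello", "worldhello", "dog"),
  B = c("helloworld", "hello", "hello"),
  stringsAsFactors = FALSE
)

# str_detect(string, pattern) compares element i of string with element i of pattern.
# fixed() marks the pattern as a literal string.
df$C <- str_detect(df$B, fixed(df$A)) | str_detect(df$A, fixed(df$B))
df$C
# [1]  TRUE  TRUE FALSE
```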

I think using one of the grep functions will be your best bet. Since you need to match in either direction, you will have to do it twice. I did basically the same thing as jlhoward, but mine is in a for loop.

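# Each row holds one (A, B) pair from the question
a <- matrix(data=c("hello", "helloworld", "worldhello", "hello", "dog", "hello"), nrow=3, byrow=TRUE)
b <- rep(NA, dim(a)[1])
for(i in sequence(dim(a)[1])){
  # grep() returns the indices of matches, so a total length above 0 means a match in at least one direction
  b[i] <- sum(length(grep(a[i,1], a[i,2])), length(grep(a[i,2], a[i,1]))) > 0
}
# cbind() on a character matrix shows the result as "TRUE"/"FALSE" strings
cbind(a,b)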
