Looping through 2 vectors of different dimension in R

I have two character vectors, a and b, of different lengths. I need to compare each element of a against every element of b and record the elements that are a close match. I'm using the agrepl function for the matching.

Following is the sample data:

a <- c("US","Canada","United States","United States of America")
b <- c("United States","U.S","United States","Canada", "America", "Spain")

Following is the code that I'm using to match. How can I avoid the for loops? My real data has more than 900 and 5000 records, respectively.

for(i in 1:4)
{
    for(j in 1:6)
    {
      bFlag <- agrepl(a[i],b[j],  max.distance = 0.1,ignore.case = TRUE)

      if(bFlag)
      {
        #Custom logic
      }
      else 
      {
        #Custom logic
      }
    }
}

You don't need a double loop, since agrepl's second argument accepts vectors of length >= 1. So you could do something like:

lapply(a, function(x) agrepl(x, b, max.distance = 0.1, ignore.case = TRUE))
# [[1]]
# [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE
# 
# [[2]]
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE
# 
# [[3]]
# [1]  TRUE FALSE  TRUE FALSE FALSE FALSE
# 
# [[4]]
# [1] FALSE FALSE FALSE FALSE FALSE FALSE

You can add custom logic inside the lapply call if needed, but since the question doesn't specify it, I'll leave the output as a list of logical vectors.
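Since the custom logic isn't specified, one possible way to keep every comparison available for later branching is to collect the results in a logical matrix instead of a list. The following is only a sketch: the names m, hits, and matched are made up for illustration, and the data frame of matched pairs is a hypothetical stand-in for whatever the real custom logic would do.

```r
a <- c("US", "Canada", "United States", "United States of America")
b <- c("United States", "U.S", "United States", "Canada", "America", "Spain")

# Rows follow b, columns follow a. vapply checks that every result
# is a logical vector of length(b).
m <- vapply(
  a,
  function(x) agrepl(x, b, max.distance = 0.1, ignore.case = TRUE),
  logical(length(b))
)

# Positions of all TRUE cells, as (row, col) index pairs.
hits <- which(m, arr.ind = TRUE)

# Hypothetical stand-in for the custom logic: list each matched pair.
matched <- data.frame(
  a = a[hits[, "col"]],
  b = b[hits[, "row"]],
  stringsAsFactors = FALSE
)
```

The non-matching branch can be reached the same way through which(!m, arr.ind = TRUE). With 900 by 5000 elements the matrix holds 4.5 million logicals, which should fit comfortably in memory.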

If you want the indices of the TRUE values instead of logicals, you can use agrep instead of agrepl:

lapply(a, function(x) agrep(x, b, max.distance = 0.1,ignore.case = TRUE))

# [[1]]
# [1] 1 2 3 6
# 
# [[2]]
# [1] 4
# 
# [[3]]
# [1] 1 3
# 
# [[4]]
# integer(0)
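If the matched strings themselves are more useful than their positions, agrep also takes a value argument. A small sketch, assuming the same sample data; the name matches and the setNames wrapper are only there to label each list element with its search pattern:

```r
a <- c("US", "Canada", "United States", "United States of America")
b <- c("United States", "U.S", "United States", "Canada", "America", "Spain")

# value = TRUE returns the matching elements of b rather than their indices.
matches <- lapply(
  setNames(a, a),
  function(x) agrep(x, b, max.distance = 0.1, ignore.case = TRUE, value = TRUE)
)
```

Patterns with no close match come back as character(0), mirroring the integer(0) in the index version.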

If you only want the first TRUE index, you can use:

sapply(a, function(x) agrep(x, b, max.distance = 0.1,ignore.case = TRUE)[1])
#  US                   Canada            United States United States of America 
#   1                        4                        1                       NA 
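To turn those first indices into a lookup table, one option is to index b with them. This is a sketch under the same sample data; first_idx and lookup are illustrative names, and vapply is swapped in for sapply only so that the result is guaranteed to be an integer vector:

```r
a <- c("US", "Canada", "United States", "United States of America")
b <- c("United States", "U.S", "United States", "Canada", "America", "Spain")

# First matching index per element of a; integer(0)[1] yields NA,
# so patterns without a match end up as NA.
first_idx <- vapply(
  a,
  function(x) agrep(x, b, max.distance = 0.1, ignore.case = TRUE)[1],
  integer(1)
)

# Indexing b with NA gives NA, so unmatched rows stay visible in the table.
lookup <- data.frame(
  a = a,
  first_match = b[unname(first_idx)],
  stringsAsFactors = FALSE
)
```

Rows where first_match is NA can then be filtered with is.na(lookup$first_match) and handled separately.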
