
How to get the closest element in a vector for every element in another vector without duplicates?

I have this code, which creates two vectors, and for each element of a I want to get the closest element in b:

a = rnorm(100)
b = rnorm(100)
c = vapply(a, function(x) which.min(abs(b - x)), 1)
table(duplicated(c))

FALSE  TRUE 
   61    39 

As you can see, this method is prone to giving a lot of duplicates, which is normal, but I would like to have no duplicates. I thought of deleting the element from b once its index has been selected, but I don't know how to do that inside vapply.

The closest match you are going to get without duplicates is by sorting the vectors and then pairing them off. The following permutation of b should allow you to do that.

p <- order(b)[order(order(a))] # order on b and then back transform the ordering of a

sum(abs(a-b[p]))
[1] 20.76788
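
To convince yourself that p really is a one-to-one assignment, you can check that it is a permutation of the indices of b and that the total distance equals the one from pairing the sorted vectors. A minimal sketch; the seed and the vector length are arbitrary choices made here for reproducibility, not part of the original code:

```r
set.seed(42)  # arbitrary seed, only for reproducibility
a <- rnorm(100)
b <- rnorm(100)

# order(order(a)) is the rank of each a[i];
# order(b)[rank] picks the index of the element of b with the same rank
p <- order(b)[order(order(a))]

# every index of b is used exactly once
stopifnot(!anyDuplicated(p), setequal(p, seq_along(b)))

# pairing by rank gives the same total as pairing the sorted vectors
total_by_rank   <- sum(abs(a - b[p]))
total_by_sorted <- sum(abs(sort(a) - sort(b)))
all.equal(total_by_rank, total_by_sorted)
```

The advantage of p over sort(b) is that b[p] lines up with a in its original order.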

Clearly, allowing duplicates does make things much closer:

sum(abs(a-b[c]))
[1] 2.45583

I believe this is the best you can get: sum(abs(sort(a) - sort(b)))
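
That claim can be checked by brute force on a small example: enumerate every one-to-one assignment and compare the smallest total distance with the sorted pairing. A rough sketch, assuming a tiny n (the number of permutations grows as n!); all_perms is a hypothetical helper written for this check, not a base R function:

```r
# hypothetical helper: list of all permutations of the vector v
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    # put v[i] first, then append every permutation of the rest
    for (rest in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], rest)
    }
  }
  out
}

set.seed(1)
n <- 6          # 6! = 720 assignments, small enough to enumerate
a <- rnorm(n)
b <- rnorm(n)

# total distance when a[i] is matched with b[p[i]]
cost <- function(p) sum(abs(a - b[p]))

best_cost   <- min(vapply(all_perms(seq_len(n)), cost, numeric(1)))
sorted_cost <- sum(abs(sort(a) - sort(b)))

all.equal(best_cost, sorted_cost)  # expected to be TRUE
```

This only tests one small draw, so it illustrates the claim rather than proving it.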

I am using data.table to preserve the original ordering of a:

library(data.table)  # unlike require(), library() stops with an error if the package is missing

set.seed(1)

a <- rnorm(100)
b <- rnorm(100)

sum(abs(a - b))
sum(abs(sort(a) - sort(b)))

dt <- data.table(a = a, b = b)
dt[, id := .I]

# sort dt by a
setkey(dt, a)

# sort b
dt[, b := sort(b)]

# return to original order
setkey(dt, id)

dt
dt[, sum(abs(a - b))]

This solution gives a better result than Chase's solution (the foo function shown further down):

dt2 <- as.data.table(foo(a,b))
dt2[, sum(abs(a - bval))]
dt[, sum(abs(a - b))]

Result:

> dt2[, sum(abs(a - bval))]
[1] 24.86731
> dt[, sum(abs(a - b))]
[1] 20.76788

This is very bad programming (it uses <<- to modify b1 from inside vapply), but it may work and avoids an explicit for loop...

a <- rnorm(100)
b <- rnorm(100)
# make a copy of b (you'll see why)
b1 <- b
res <- vapply(a, function(x) {
  ret <- which.min(abs(b1 - x))
  b1[ret] <<- NA  # mark this element as used; which.min ignores NA
  return(ret)
}, 1)
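
If you want the same greedy idea without touching a global variable, the bookkeeping can be kept inside a function. A minimal sketch, assuming length(a) <= length(b); greedy_match and available are names made up here for illustration:

```r
# hypothetical wrapper: greedy nearest match, each index of b used at most once
greedy_match <- function(a, b) {
  available <- b  # local copy; used elements are set to NA
  vapply(a, function(x) {
    i <- which.min(abs(available - x))  # NA entries are skipped by which.min
    available[i] <<- NA                 # modifies `available` in greedy_match's frame only
    i
  }, integer(1))
}

set.seed(1)
a <- rnorm(100)
b <- rnorm(100)

idx <- greedy_match(a, b)
anyDuplicated(idx)   # 0 means no index of b was chosen twice
sum(abs(a - b[idx])) # total distance of the greedy assignment
```

Note that the result depends on the order of a, since earlier elements get first pick.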

This can almost certainly be improved upon through vectorization, but it appears to work and may get the job done:

set.seed(1)
a = rnorm(5)
b = rnorm(5)

foo <- function(a,b) {

  out <- cbind(a, bval = NA)

  for (i in seq_along(a)) {
    #which value of B is closest?
    whichB <- which.min(abs(b - a[i]))
    #Assign that value to the bval column
    out[i, "bval"] <- b[whichB]
    #Remove that value of B from being chosen again
    b <- b[-whichB]
  }

  return(out)

}

#In action
foo(a,b)
---
              a       bval
[1,] -0.6264538 -0.8204684
[2,]  0.1836433  0.4874291
[3,] -0.8356286 -0.3053884
[4,]  1.5952808  0.7383247
[5,]  0.3295078  0.5757814
