Calculating Euclidean distances between multiple vectors in R

Question

I'm trying to calculate the Euclidean distances between a single vector and each of several other vectors in R.

So far, I've been following the neighbr documentation (https://cran.r-project.org/web/packages/neighbr/neighbr.pdf) and using distance(x, y, "euclidean"). This works well when I only need the distance between two vectors, i.e. when x and y each contain a single row of data. In my actual dataset, however, y has multiple rows, and I'd like to calculate the distance between each of those rows and the single row in x.

How can I do this?

x = structure(list(`Feature I` = 0.85649790378586, `Feature II` = 0.851856356221207, `Feature III` = 0.799580263077569, `Feature IV` = 0.895081402129565, `Feature V` = 0.920173237422567), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"))

y = structure(list(`Feature I` = c(0.0444280626160322, 0.00326398594129033, 0.0218000692329814), `Feature II` = c(0.0481646509894741, 0.00509786237104908, 0.0276902769176258), `Feature III` = c(0.0456380620204004, 0.00422956673025977, 0.0347273727088683), `Feature IV` = c(0.0365954415011219, 0.00422974884164406, 0.0328151120410415), `Feature V` = c(0.0384331094111439, 0.00362614754925969, 0.0260414956219995)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))

Answer 1

Adapting this answer to your data:

y$dist_from_x = t(outer(
  1:nrow(x),
  1:nrow(y),
  FUN = Vectorize(function(xi,yi) dist(rbind(x[xi,],y[yi,])))
))

y
#     Feature I  Feature II Feature III  Feature IV   Feature V dist_from_x
# 1 0.044428063 0.048164651 0.045638062 0.036595442 0.038433109    1.840726
# 2 0.003263986 0.005097862 0.004229567 0.004229749 0.003626148    1.926465
# 3 0.021800069 0.027690277 0.034727373 0.032815112 0.026041496    1.871883
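
The outer() call above invokes dist() once per pair of rows. As a hedged alternative sketch (not part of the original answer), dist() can also be called once on x stacked on top of y; the first row of the resulting distance matrix then holds the distances from x to every row of y. The data are re-entered here as plain data frames so the snippet runs on its own, and d_all and d_stacked are illustrative names only.

```r
# Data from the question, re-entered as plain data frames.
x <- data.frame(
  `Feature I` = 0.85649790378586,
  `Feature II` = 0.851856356221207,
  `Feature III` = 0.799580263077569,
  `Feature IV` = 0.895081402129565,
  `Feature V` = 0.920173237422567,
  check.names = FALSE
)
y <- data.frame(
  `Feature I` = c(0.0444280626160322, 0.00326398594129033, 0.0218000692329814),
  `Feature II` = c(0.0481646509894741, 0.00509786237104908, 0.0276902769176258),
  `Feature III` = c(0.0456380620204004, 0.00422956673025977, 0.0347273727088683),
  `Feature IV` = c(0.0365954415011219, 0.00422974884164406, 0.0328151120410415),
  `Feature V` = c(0.0384331094111439, 0.00362614754925969, 0.0260414956219995),
  check.names = FALSE
)

# Stack x on top of y and compute all pairwise Euclidean distances in one call.
d_all <- as.matrix(dist(rbind(x, y)))

# Row 1 corresponds to x; dropping column 1 removes the distance from x to itself.
d_stacked <- unname(d_all[1, -1])
d_stacked
```

This also computes the distances among the rows of y, which are then discarded, so it is probably only sensible when y has few rows.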

Since x has only one row, this is a little more efficient:

# reset definition of y (or remove the dist_from_x column)
x_expanded = x[rep(1, nrow(y)), ]
y$dist_from_x = sqrt(rowSums((x_expanded - y)^2))
# same result as above
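
The row-expansion step above can also be avoided. The following is a minimal sketch, assuming the columns of x and y are in the same order: it converts y to a numeric matrix and uses sweep() to subtract the single row of x from every row of y before summing the squared differences. The names x_vec, y_mat and d_sweep are illustrative, and the data are re-entered as plain data frames so the snippet is self-contained.

```r
# Data from the question, re-entered as plain data frames.
x <- data.frame(
  `Feature I` = 0.85649790378586,
  `Feature II` = 0.851856356221207,
  `Feature III` = 0.799580263077569,
  `Feature IV` = 0.895081402129565,
  `Feature V` = 0.920173237422567,
  check.names = FALSE
)
y <- data.frame(
  `Feature I` = c(0.0444280626160322, 0.00326398594129033, 0.0218000692329814),
  `Feature II` = c(0.0481646509894741, 0.00509786237104908, 0.0276902769176258),
  `Feature III` = c(0.0456380620204004, 0.00422956673025977, 0.0347273727088683),
  `Feature IV` = c(0.0365954415011219, 0.00422974884164406, 0.0328151120410415),
  `Feature V` = c(0.0384331094111439, 0.00362614754925969, 0.0260414956219995),
  check.names = FALSE
)

x_vec <- unlist(x)      # the single row of x as a numeric vector
y_mat <- as.matrix(y)   # rows of y as a numeric matrix

# sweep() subtracts x_vec[j] from column j of y_mat (its default FUN is "-"),
# so each row of the result is (y_i - x); squaring removes the sign.
d_sweep <- sqrt(rowSums(sweep(y_mat, 2, x_vec)^2))
d_sweep
```

Working on a matrix rather than a data frame avoids building x_expanded, and it does not break if a dist_from_x column has already been added to y, provided y_mat is built from the feature columns only.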

Answer 1: score 2, accepted, 2020-10-28 16:55:25
