How to calculate distance between points in R with negative and positive coordinates
I have the following data frame of cluster points and their corresponding coordinates:
library(tidyverse)
dat <- structure(list(clusters = c("1", "10", "11", "12", "13", "14",
"15", "2", "3", "4", "5", "6", "7", "8", "9"), X = c(-54.6159770964014,
-28.2872926332498, 52.8522393678039, -25.8140448004464, 38.9620763534183,
70.8641808918484, -15.1724011440888, 40.730220888559, 9.24483114349649,
-55.927722121683, -6.27401943653456, -64.5652744957147, 18.7919353226617,
20.0562482846276, -15.9544504453054), Y = c(8.22248244829743,
28.9054292231316, -34.6075657907431, -37.9486871165297, -12.736119840414,
-3.14128802462344, -1.12492457003011, 21.0867357880599, -17.678289925719,
40.2262495018696, 33.0017714263723, -24.491950293976, 56.579084048791,
-47.9835978682792, 71.6687592084785)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -15L))
dat
#> # A tibble: 15 x 3
#>    clusters      X      Y
#>    <chr>     <dbl>  <dbl>
#>  1 1        -54.6    8.22
#>  2 10       -28.3   28.9
#>  3 11        52.9  -34.6
#>  4 12       -25.8  -37.9
#>  5 13        39.0  -12.7
#>  6 14        70.9   -3.14
#>  7 15       -15.2   -1.12
#>  8 2         40.7   21.1
#>  9 3          9.24 -17.7
#> 10 4        -55.9   40.2
#> 11 5         -6.27  33.0
#> 12 6        -64.6  -24.5
#> 13 7         18.8   56.6
#> 14 8         20.1  -48.0
#> 15 9        -16.0   71.7
Visually it looks like this:
[figure: plot of the cluster points]
What I want to do is calculate the distance between the points. This is my attempt using Euclidean distance:
dm <- dist(dat[-1])
dm
The result is this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
2 33.48110
3 115.68851 103.04137
4 54.41809 66.89985 78.73720
5 95.89638 79.09802 25.90940 69.50985
6 125.99367 104.20176 36.25682 102.75327 33.31374
7 40.53603 32.76923 75.81846 38.33059 55.36571 86.06021
8 96.21012 69.45897 56.99823 88.95685 33.86904 38.66591 60.15364
9 68.91337 59.82226 46.77827 40.49708 30.12540 63.31089 29.49941 49.94053
10 32.03064 29.86895 132.03477 83.77442 108.66962 134.00347 58.05959 98.53466 87.18026
11 54.32272 22.39116 89.81613 73.59198 64.32930 85.18581 35.26773 48.49089 53.00286 50.17652
12 34.19390 64.55519 117.85244 41.02123 104.19267 137.10211 54.64132 114.73691 74.12393 65.29206 81.87428
13 87.90383 54.61030 97.34017 104.52365 72.19025 79.23409 66.95766 41.72523 74.86858 76.48818 34.41209 116.27956
14 93.46157 90.82412 35.41885 46.95512 39.99769 67.76635 58.62417 72.09802 32.17605 116.42397 85.15816 87.82175 104.57033
15 74.29767 44.50619 126.60576 110.05997 100.69761 114.60374 72.79788 75.97166 92.83264 50.85758 39.86034 107.74922 37.88152 124.95382
I found that the result is not consistent with the figure. For example, cluster 6 looks closer to cluster 1 than to cluster 2, but the distances calculated by dist() are:
6 to 1, distance = 125.99367
6 to 2, distance = 104.20176
What is the proper way to calculate this so that the values are consistent with the plot?
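dist() numbers its output by row position, not by the values in the clusters column. In your data the rows are sorted alphabetically by label (1, 10, 11, ..., 9), so row 6 is cluster 14 and row 2 is cluster 10; the value 125.99367 is therefore the distance between clusters 14 and 1, not between clusters 6 and 1. When you order your data frame by the numeric cluster label, the row numbers match the cluster labels, and the distance matrix is in line with the plot.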
dat1 <- dat[order(as.numeric(dat$clusters)), ]
# drop the character column `clusters` so that only X and Y enter the distance
dist(dat1[-1])
1 2 3 4 5 6 7 8 9 10 11 12 13 14
2 96.21012
3 68.91337 49.94053
4 32.03064 98.53466 87.18026
5 54.32272 48.49089 53.00286 50.17652
6 34.19390 114.73691 74.12393 65.29206 81.87428
7 87.90383 41.72523 74.86858 76.48818 34.41209 116.27956
8 93.46157 72.09802 32.17605 116.42397 85.15816 87.82175 104.57033
9 74.29767 75.97166 92.83264 50.85758 39.86034 107.74922 37.88152 124.95382
10 33.48110 69.45897 59.82226 29.86895 22.39116 64.55519 54.61030 90.82412 44.50619
11 115.68851 56.99823 46.77827 132.03477 89.81613 117.85244 97.34017 35.41885 126.60576 103.04137
12 54.41809 88.95685 40.49708 83.77442 73.59198 41.02123 104.52365 46.95512 110.05997 66.89985 78.73720
13 95.89638 33.86904 30.12540 108.66962 64.32930 104.19267 72.19025 39.99769 100.69761 79.09802 25.90940 69.50985
14 125.99367 38.66591 63.31089 134.00347 85.18581 137.10211 79.23409 67.76635 114.60374 104.20176 36.25682 102.75327 33.31374
15 40.53603 60.15364 29.49941 58.05959 35.26773 54.64132 66.95766 58.62417 72.79788 32.76923 75.81846 38.33059 55.36571 86.06021
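If you would rather not depend on row order at all, one option is to attach the cluster labels to the coordinates as row names, so that the distance matrix can be indexed by label. The following is only a minimal sketch in base R, assuming the same data as in the question; the names xy, dm, d61, d62 and manual61 are illustrative and do not come from the original post.

```r
# Data from the question, in the original (alphabetical) row order
dat <- data.frame(
  clusters = c("1", "10", "11", "12", "13", "14", "15", "2", "3", "4",
               "5", "6", "7", "8", "9"),
  X = c(-54.6159770964014, -28.2872926332498, 52.8522393678039,
        -25.8140448004464, 38.9620763534183, 70.8641808918484,
        -15.1724011440888, 40.730220888559, 9.24483114349649,
        -55.927722121683, -6.27401943653456, -64.5652744957147,
        18.7919353226617, 20.0562482846276, -15.9544504453054),
  Y = c(8.22248244829743, 28.9054292231316, -34.6075657907431,
        -37.9486871165297, -12.736119840414, -3.14128802462344,
        -1.12492457003011, 21.0867357880599, -17.678289925719,
        40.2262495018696, 33.0017714263723, -24.491950293976,
        56.579084048791, -47.9835978682792, 71.6687592084785),
  stringsAsFactors = FALSE
)

# Row position 6 holds cluster "14", which is why the labels looked wrong
dat$clusters[6]

# Numeric X/Y matrix with the cluster labels as row names (assumed names)
xy <- as.matrix(dat[, c("X", "Y")])
rownames(xy) <- dat$clusters

# Full symmetric distance matrix; dist() carries the row names over as labels
dm <- as.matrix(dist(xy))

# Look up distances by cluster label instead of by row position
d61 <- dm["6", "1"]   # about 34.19
d62 <- dm["6", "2"]   # about 114.74

# Hand check of the Euclidean formula for clusters 6 and 1
manual61 <- sqrt(sum((xy["6", ] - xy["1", ])^2))
all.equal(manual61, d61)
```

With labelled lookups, cluster 6 comes out closer to cluster 1 than to cluster 2, whatever order the rows are in.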