How to measure the distance between points in separate data frames?

Question

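I created two data frames, each with a geom column of type POINT. Now I want to compute the distance between each pair of corresponding points, e.g. the point in the first row of the first df against the point in the first row of the second df, and so on. Here are my data frames: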

df1 <- table %>%
  st_as_sf(coords = c("lonCust","latCust"), crs = 4326)

df2 <- table %>%
  st_as_sf(coords = c("lonApp","latApp"), crs = 4326)

I used st_distance:

distance <- st_distance(df1$geometry,df2$geometry)

But I get a matrix in which the distance is computed for every pair across the two geom columns:

           [,1]      [,2]        [,3]         [,4]        [,5]  ...
[1,]   139.7924 7735.5718 15225.02995   558.104089  1016.58121
[2,]  8503.0544  755.2915  8764.75396  7957.289600  8788.02800
[3,] 15306.5855 9336.9008    18.96914 14876.589918 15929.51643
[4,]   548.3045 7232.0164 14898.70637     8.094068  1078.38236
[5,]   911.5635 8084.3086 15993.36365  1127.730022    46.97799
.
.

I want the distances in a single column, computed only between corresponding geometry rows:

           [,1]     
[1,]   139.7924 
[2,]  8503.0544
[3,] 15306.5855 
[4,]   548.3045
[5,]   911.5635
.
.

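I have read about the geosphere package, but sf has a very nice st_distance function for measuring distances and I would like to use it. Most importantly, do I need to join these data frames first? A simple inner_join from dplyr does not allow joining two spatial data frames, and st_join is not an option for me because I do not want to join by geometry (the geometries in the two data frames are completely different).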

Answer 1

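As @mrhellmann mentioned, you can add by_element = TRUE to st_distance and it should work. If speed is still a problem, I recommend distGeo() from the geosphere package. But make sure to check the documentation to see whether your data suits this function.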

library(geosphere)
library(tidyverse)
library(sf)
library(units)  # provides set_units()

df1 <- table %>%
  st_as_sf(coords = c("lonCust","latCust"), crs = 4326)

df_crs4326 <- df1 %>%
  mutate(
    # st_as_sf() drops the coordinate columns by default, so recover them.
    # A POINT stores longitude (X) first and latitude (Y) second.
    lonCust = st_coordinates(geometry)[, "X"],
    latCust = st_coordinates(geometry)[, "Y"],
    # geometry_2 = st_as_sfc(coords = c("lonApp","latApp"), crs = 4326)
    ) %>%
  mutate(
    # distGeo() takes two-column (lon, lat) matrices and pairs them row by row
    distance_to_next = distGeo(cbind(lonCust, latCust), cbind(lonApp, latApp)) %>% set_units(m),
    # distance_2 = st_distance(geometry, geometry_2, by_element = TRUE)
    )

Note that, without reproducible data to test against, I am not sure whether the commented-out parts work.
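
For reference, the by_element suggestion on its own could look roughly like the sketch below. The small `table` here is made-up sample data standing in for the asker's table, and it assumes both point sets come from the same table, so row i in one matches row i in the other and no join is needed.

```r
library(sf)

# Hypothetical stand-in for `table`: one customer/appointment pair per row
table <- data.frame(
  lonCust = c(-96.00, -97.50, -95.20),
  latCust = c(39.00, 35.40, 29.80),
  lonApp  = c(-96.01, -97.40, -95.20),
  latApp  = c(39.01, 35.50, 29.80)
)

df1 <- st_as_sf(table, coords = c("lonCust", "latCust"), crs = 4326)
df2 <- st_as_sf(table, coords = c("lonApp", "latApp"), crs = 4326)

# by_element = TRUE pairs row i of df1 with row i of df2 and returns a
# vector of length nrow(df1) instead of the full pairwise matrix
table$distance <- st_distance(df1$geometry, df2$geometry, by_element = TRUE)

# Full pairwise matrix, kept only to show that its diagonal holds the same values
full <- st_distance(df1$geometry, df2$geometry)
```

The result is a units vector in metres that can be stored as an ordinary column, as done with table$distance here.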

Answer 2

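Ultra-fast vectorised computation

This method works as follows:

1. Project the (longitude, latitude) coordinates to a coordinate system that is equidistant for your region of interest. (An equidistant coordinate system approximately preserves distance measurements between points within that region, so you can use basic geometry to compute distances.)
2. Convert the geometries to a base R matrix with X and Y columns.
3. Finally, simply use Pythagoras' theorem to compute the distance between each pair of points.

First, get a coordinate reference system (CRS)

For this you need an equidistant CRS. This means that, within the region of interest, distance calculations are preserved.

Assuming you are interested in computing distances in the United States, you can use EPSG:102005 (registered under the ESRI authority as ESRI:102005, USA Contiguous Equidistant Conic). For more details, see this GIS answer. The choice of CRS is crucial here, so make sure you get it right, otherwise the answers will be nonsense.

Applied to your example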

library(sf)
library(dplyr)  # provides %>%

crs.source = 4326
crs.dest = st_crs("+proj=eqdc +lat_0=39 +lon_0=-96 +lat_1=33 +lat_2=45 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs")

# coords1 and coords2 are matrices with columns X and Y and rows of points in the `crs.dest` coordinate system.
coords1 <- table %>%
  st_as_sf(coords = c("lonCust","latCust"), crs = crs.source) %>%
  st_transform(crs.dest) %>%
  st_coordinates()
  
coords2 <- table %>%
  st_as_sf(coords = c("lonApp","latApp"), crs = crs.source) %>%
  st_transform(crs.dest) %>%
  st_coordinates()

# This is a vectorised computation, and so should be instant for a mere 25,000 rows :-)
table$distances = local({
  x_diff = coords1[, 'X'] - coords2[, 'X']
  y_diff = coords1[, 'Y'] - coords2[, 'Y']
  # Pythagoras on the coordinate differences (not on undefined `x` and `y`)
  sqrt(x_diff^2 + y_diff^2)
})
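
Because the result depends so heavily on the CRS, it may be worth checking the planar distances against geodesic ones on a few rows before trusting them. The sketch below does that with made-up sample points inside the contiguous US; `table`, `pts1`, `pts2` and `rel_error` are illustrative names, not part of the original answer.

```r
library(sf)

# Hypothetical sample pairs inside the contiguous US
table <- data.frame(
  lonCust = c(-96.00, -97.50, -95.20),
  latCust = c(39.00, 35.40, 29.80),
  lonApp  = c(-96.01, -97.40, -95.30),
  latApp  = c(39.01, 35.50, 29.90)
)

crs.dest <- st_crs("+proj=eqdc +lat_0=39 +lon_0=-96 +lat_1=33 +lat_2=45 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs")

pts1 <- st_as_sf(table, coords = c("lonCust", "latCust"), crs = 4326)
pts2 <- st_as_sf(table, coords = c("lonApp", "latApp"), crs = 4326)

# Planar distances in the projected CRS (the approach above)
xy1 <- st_coordinates(st_transform(pts1, crs.dest))
xy2 <- st_coordinates(st_transform(pts2, crs.dest))
planar <- sqrt((xy1[, "X"] - xy2[, "X"])^2 + (xy1[, "Y"] - xy2[, "Y"])^2)

# Geodesic distances on the unprojected lon/lat points, for comparison
geodesic <- as.numeric(st_distance(pts1, pts2, by_element = TRUE))

# Relative error of the planar shortcut for each pair
rel_error <- abs(planar - geodesic) / geodesic
```

If rel_error is larger than you can tolerate for your region, that is a sign the chosen CRS does not fit the data and st_distance with by_element = TRUE is the safer route.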

Answer 1
1, accepted, 2021-01-04 16:07:58

Answer 2
0, 2022-01-01 21:28:46