Minimum Distance between lat long across multiple data frames
I have a data frame named A that has latitude and longitude in separate columns. Sample data:
ID Lat Long
a 10.773046 76.6392061
b 10.7751978 76.6368363
c 12.954027 78.988818
d 12.9608638 77.521573
I have a data frame named Test that has lat and long in separate columns. Sample data:
Store Lat Long
a 21.244769 81.63861
b 9.919337 78.14844
c 10.053961 76.32757
d 13.829922 77.49369
e 23.849729 77.93647
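I want to loop over each ID and find the minimum distance from its lat long to the lat long of the nearest store. So ID a would be checked against stores a, b, c, d and e to find the closest one.
Goal: find the minimum distance and the store name.
The output should tell me: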
Id Lat Long Store Distance
a 10.773046 76.6392061 b 50ms
a$Distance <- NA # Make an "empty" variable in my data.frame
myFunction <- function(x, y){
distm(c(lon1, lat1), c(lon2, lat2), fun = distHaversine)
}
for(ii in a){
for(jj in Test){
tempX <- a[a$Lat == ii & Store$Lat== jj, c("Lat")]
tempY <- a[a$Long == ii & Store$Long == jj, c("Long")]
# "Save" results into appropriate location in my data.frame
myFunction(tempX,tempY)
}
}
I am unable to get the exact output.
You can take a look at this:
a <- data.frame(ID = c("a", "b", "c", "d"), Lat = c(10.773046, 10.7751978, 12.954027, 12.9608638),
Long = c(76.6392061, 76.6392061, 78.988818, 77.521573))
b <- data.frame(Store = c("a", "b", "c", "d", "e"), Lat = c(21.244769, 9.919337, 10.053961, 13.829922, 23.849729),
Long = c(81.63861, 78.14844, 76.32757, 77.49369, 77.93647))
library(tidyverse)
# Haversine great-circle distance in km; note that longitude comes before latitude
earth.dist <- function (long1, lat1, long2, lat2)
{
rad <- pi/180
a1 <- lat1 * rad
a2 <- long1 * rad
b1 <- lat2 * rad
b2 <- long2 * rad
dlon <- b2 - a2
dlat <- b1 - a1
a <- (sin(dlat/2))^2 + cos(a1) * cos(b1) * (sin(dlon/2))^2
c <- 2 * atan2(sqrt(a), sqrt(1 - a))
R <- 6378.145
d <- R * c
return(d)
}
a1 <- a %>%
group_by(ID, Lat, Long) %>%
# approximate the nearest store by the smallest sum of absolute differences in degrees
summarise(closest = which.min(abs(Lat - b$Lat) + abs(Long - b$Long))) %>%
mutate(Store = b$Store[closest],
# straight-line distance in degrees
Distance = sqrt((Lat - b$Lat[closest])^2 + (Long - b$Long[closest])^2),
# great-circle distance in km; earth.dist() expects longitude before latitude
distKm = earth.dist(Long, Lat, b$Long[closest], b$Lat[closest]))
The result is:
a1
ID Lat Long closest Store Distance distKm
<fct> <dbl> <dbl> <int> <fct> <dbl> <dbl>
1 a 10.8 76.6 3 c 0.784 87.0
2 b 10.8 76.6 3 c 0.786 87.2
3 c 13.0 79.0 4 d 1.73 189.
4 d 13.0 77.5 4 d 0.870 96.8
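The pipeline above picks the nearest store by comparing differences in degrees, which is only an approximation of distance on the ground. As a minimal sketch in base R, the nearest store can instead be ranked by the haversine distance itself. The names `ids`, `stores` and `haversine_km` are hypothetical and only used for illustration; the data are the sample values from this answer.

```r
# Sample data from the answer above (renamed to avoid clashing with a and b)
ids <- data.frame(ID = c("a", "b", "c", "d"),
                  Lat = c(10.773046, 10.7751978, 12.954027, 12.9608638),
                  Long = c(76.6392061, 76.6392061, 78.988818, 77.521573),
                  stringsAsFactors = FALSE)
stores <- data.frame(Store = c("a", "b", "c", "d", "e"),
                     Lat = c(21.244769, 9.919337, 10.053961, 13.829922, 23.849729),
                     Long = c(81.63861, 78.14844, 76.32757, 77.49369, 77.93647),
                     stringsAsFactors = FALSE)

# Hypothetical helper: haversine distance in km, vectorised over its arguments
haversine_km <- function(lat1, long1, lat2, long2, R = 6378.145) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (long2 - long1) * rad
  h <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * R * asin(pmin(1, sqrt(h)))
}

# Distance matrix: one row per ID, one column per store
dist_km <- outer(seq_len(nrow(ids)), seq_len(nrow(stores)),
                 function(i, j) haversine_km(ids$Lat[i], ids$Long[i],
                                             stores$Lat[j], stores$Long[j]))

# Column index of the nearest store for each ID, and the matching distance
closest <- apply(dist_km, 1, which.min)
result <- cbind(ids,
                Store = stores$Store[closest],
                Distance_km = dist_km[cbind(seq_len(nrow(ids)), closest)])
print(result)
```

Because the ranking and the reported distance come from the same function, the two cannot disagree, which is the main reason to prefer this over a degree-based shortcut.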
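Here is a solution using the geosphere library, which calculates distances in meters (the script converts them to km). Performance is acceptable if your dataset is reasonably sized (i.e. < 50,000).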
a <- data.frame(ID = c("a", "b", "c", "d"), Lat = c(10.773046, 10.7751978, 12.954027, 12.9608638),
Long = c(76.6392061, 76.6392061, 78.988818, 77.521573))
b <- data.frame(Store = c("a", "b", "c", "d", "e"), Lat = c(21.244769, 9.919337, 10.053961, 13.829922, 23.849729),
Long = c(81.63861, 78.14844, 76.32757, 77.49369, 77.93647))
library(geosphere)
#calculate the distance matrix in meters; distm() expects columns in (longitude, latitude) order
distmatrix<-distm(a[, c("Long", "Lat")], b[, c("Long", "Lat")])
#find closest column and get distance
closest<-apply(distmatrix, 1, which.min)
a$store<-as.character(b$Store[closest])
a$distance<-apply(distmatrix, 1, min)/1000
a
ID Lat Long store distance
1 a 10.77305 76.63921 c 86.54914
2 b 10.77520 76.63921 c 86.76789
3 c 12.95403 78.98882 d 188.71751
4 d 12.96086 77.52157 d 96.19473
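The full distance matrix has one cell per ID/store pair, so memory grows quickly with larger data frames. One possible way around that, sketched here in base R so it runs without extra packages, is to process the IDs in chunks and keep only the nearest store for each row. `haversine_km`, `nearest_store` and `chunk_size` are hypothetical names, and the spherical haversine formula is an assumption that gives slightly different values from the geosphere results above; the same chunking idea should carry over to `distm`.

```r
# Hypothetical helper: haversine distance in km on a spherical Earth
haversine_km <- function(lat1, long1, lat2, long2, R = 6378.137) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (long2 - long1) * rad
  h <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * R * asin(pmin(1, sqrt(h)))
}

# Hypothetical helper: nearest row of `stores` for each row of `pts`,
# computed chunk_size rows at a time (assumes both have at least one row)
nearest_store <- function(pts, stores, chunk_size = 1000) {
  n <- nrow(pts)
  idx <- integer(n)
  dist <- numeric(n)
  for (start in seq(1, n, by = chunk_size)) {
    rows <- start:min(start + chunk_size - 1, n)
    # Distance block: length(rows) x nrow(stores)
    block <- outer(rows, seq_len(nrow(stores)),
                   function(i, j) haversine_km(pts$Lat[i], pts$Long[i],
                                               stores$Lat[j], stores$Long[j]))
    # Column of the smallest distance in each row of the block
    idx[rows] <- max.col(-block, ties.method = "first")
    dist[rows] <- block[cbind(seq_along(rows), idx[rows])]
  }
  data.frame(store = as.character(stores$Store[idx]),
             distance = dist,
             stringsAsFactors = FALSE)
}

a <- data.frame(ID = c("a", "b", "c", "d"),
                Lat = c(10.773046, 10.7751978, 12.954027, 12.9608638),
                Long = c(76.6392061, 76.6392061, 78.988818, 77.521573))
b <- data.frame(Store = c("a", "b", "c", "d", "e"),
                Lat = c(21.244769, 9.919337, 10.053961, 13.829922, 23.849729),
                Long = c(81.63861, 78.14844, 76.32757, 77.49369, 77.93647))

# A tiny chunk size just to exercise the chunking on the sample data
chunked <- nearest_store(a, b, chunk_size = 2)
print(cbind(a, chunked))
```

Only one block of `chunk_size` rows is held in memory at a time, so the chunk size trades memory for the number of loop iterations.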
This solution is based on a similar question: Is there an efficient way to group nearby locations based on longitude and latitude?