Using geosphere::distm more efficiently?

Question

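Using store location data, I am trying to find "competitors", defined as other stores within a certain distance.

I am using geosphere::distm and some matrix operations, as shown below. The problem is that my matrix is very large (100,000 x 100,000) and the computation takes a very long time (or my memory cannot support this kind of analysis). Is there a way to make the code below more efficient? The input file looks like locations_data (but larger). The desired output is the data table competitors, in which each row contains a pair of competitors. I am new to writing efficient code in R and would appreciate some help.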

locations_data<-cbind(id=1:100, longitude=runif(100,min=-180, max=-120), latitude=runif(100, min=50, max=85))

library(geosphere)
# full pairwise distance matrix in metres; columns 2:3 are longitude, latitude
mymatrix<-distm(locations_data[,2:3])

library(data.table)
analyze_competitors<-function(mymatrix){
    # row/col indices of every pair less than 1,000,000 m (1,000 km) apart;
    # this includes each store paired with itself, and both (i, j) and (j, i)
    competitors<-which(mymatrix<1000000,arr.ind = TRUE)
    competitors<-data.table(competitors)
    return(competitors)
}

competitors<-analyze_competitors(mymatrix)

Answer 1

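If you want smaller matrices, consider splitting the data using a grid based on longitude and/or latitude. For example, this generates two new columns holding the labels of a 5 x 5 grid.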

#converting your example data to a tibble.
locations_data<-tibble::as_tibble(locations_data)
#create a numeric grid spanning the extent of your latitude and longitude
locations_data$long_quant<-findInterval(locations_data$longitude, quantile(locations_data$longitude,probs = seq(0,1,.2)), rightmost.closed=TRUE)
locations_data$lat_quant<-findInterval(locations_data$latitude, quantile(locations_data$latitude,probs = seq(0,1,.2)), rightmost.closed=TRUE)

You can then create multiple smaller matrices using subsets of locations_data.
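As a rough sketch of what that could look like (not part of the original answer): the code below splits the data by grid cell, runs distm on each cell separately, and stacks the resulting pairs. The helper name pairs_in_cell, the column names id1, id2 and dist_m, and the use of data.table for splitting are assumptions made for illustration. The 1,000,000 m threshold is taken from the question. Note that this only finds pairs whose two stores fall in the same cell. Stores near a cell boundary can have competitors in a neighbouring cell, so a complete solution would also need to compare adjacent cells.

```r
library(geosphere)
library(data.table)

set.seed(1)
# example data in the same shape as the question's locations_data
locations_data <- data.table(
  id = 1:100,
  longitude = runif(100, min = -180, max = -120),
  latitude = runif(100, min = 50, max = 85)
)

# 5 x 5 grid labels, built the same way as in the answer above
locations_data[, long_quant := findInterval(
  longitude, quantile(longitude, probs = seq(0, 1, 0.2)), rightmost.closed = TRUE)]
locations_data[, lat_quant := findInterval(
  latitude, quantile(latitude, probs = seq(0, 1, 0.2)), rightmost.closed = TRUE)]

max_dist <- 1e6  # metres, the threshold used in the question

# hypothetical helper: competitor pairs among the stores of a single grid cell
pairs_in_cell <- function(cell) {
  if (nrow(cell) < 2) return(NULL)  # nothing to compare
  d <- distm(cbind(cell$longitude, cell$latitude))
  # upper.tri() keeps each pair once and drops the diagonal (self-pairs)
  idx <- which(d < max_dist & upper.tri(d), arr.ind = TRUE)
  data.table(
    id1 = cell$id[idx[, "row"]],
    id2 = cell$id[idx[, "col"]],
    dist_m = d[idx]
  )
}

# one small distance matrix per cell instead of one large matrix
cells <- split(locations_data, by = c("long_quant", "lat_quant"))
competitors_by_cell <- rbindlist(lapply(cells, pairs_in_cell))
```

Each call to distm here only sees the stores of one cell, so the largest matrix held in memory is the size of the most populated cell squared rather than the full 100,000 x 100,000.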

Solution 1: accepted, 2020-01-18 00:55:35
