
How can I calculate distance between multiple latitude and longitude data?

I have location data (latitude and longitude) for 1100 stations and 10000 houses. Is it possible to use R to calculate, for each house, the shortest distance to any station? I also want to know which station gives that shortest distance for each house. Is that possible?

Here's a toy example for computing distances in bulk between m points and n cities. It should translate directly to your station/house problem.

For worldcities, I spun the globe (so to speak) and stopped on four cities. For coords, I spun again and stopped at two points. The two counts here are immaterial: whether we have 4 and 2 or 1100 and 10000, it should not matter much.

worldcities <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
lat,lon
39.7642548,-104.9951942
48.8588377,2.2770206
26.9840891,49.4080842
13.7245601,100.493026")

coords <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
lat,lon
27.9519571,66.8681431
40.5351151,-108.4939948")

(A quick note: in my experience, tools often give us coordinates as "latitude, longitude". geosphere functions, however, assume "longitude, latitude". My coordinates above were copied straight from random views in Google Maps, and I didn't want to edit them, so I reverse the columns below with [,2:1] column indexing. If you forget and pass coordinates that cannot possibly be valid, you'll get the error Error in .pointsToMatrix(p1) : latitude < -90 , which should be a hint that you have likely reversed the order of your coordinates. At which point you scratch your head and wonder whether all of your other projects have used the wrong coordinates, calling your conclusions into question. Not me, I've never been there. This year.)
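To make the column-order point concrete, here is a minimal base-R sketch of a guard you could run before calling geosphere. The helper name to_lonlat and its arguments are hypothetical (they are not part of geosphere); it assumes a data frame with columns named lat and lon, as in the toy data above.

```r
# Hypothetical helper (not part of geosphere): reorder a lat/lon data frame
# into lon/lat order and fail early if the values look swapped.
to_lonlat <- function(df, lat = "lat", lon = "lon") {
  stopifnot(all(c(lat, lon) %in% names(df)))
  if (any(abs(df[[lat]]) > 90)) {
    stop("latitude outside [-90, 90]; are the columns swapped?")
  }
  if (any(abs(df[[lon]]) > 180)) {
    stop("longitude outside [-180, 180]")
  }
  # Return a two-column matrix in the order longitude, latitude.
  as.matrix(df[, c(lon, lat)])
}

coords <- data.frame(lat = c(27.9519571, 40.5351151),
                     lon = c(66.8681431, -108.4939948))
lonlat <- to_lonlat(coords)
colnames(lonlat)
```

Note that a range check like this only catches swaps where a longitude falls outside [-90, 90]; a swapped pair such as (27.95, 66.87) would pass silently, so it is a safety net rather than a proof of correctness.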

Let's find the distance in meters between each of coords (each row) and each city (each column):

dists <- outer(seq_len(nrow(coords)), seq_len(nrow(worldcities)),
               function(i, j) geosphere::distHaversine(coords[i,2:1], worldcities[j,2:1]))
dists
#            [,1]    [,2]     [,3]     [,4]
# [1,] 12452329.0 5895577  1726433  3822220
# [2,]   309802.8 7994185 12181477 13296825

It should be straightforward to find which city is closest to each coordinate, with

apply(dists, 1, which.min)
# [1] 3 1

That is, the first point is closest to the third city, and the second point is closest to the first city.
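The question also asks for the shortest distance itself, not just which station gives it. As a small sketch, the distance matrix printed above is re-entered by hand here so the snippet runs on its own; the result data frame and its column names are my own choice for illustration.

```r
# Distance matrix copied from the output above: rows are points, columns are cities.
dists <- matrix(c(12452329.0, 5895577,  1726433,  3822220,
                    309802.8, 7994185, 12181477, 13296825),
                nrow = 2, byrow = TRUE)

# Column index of the nearest city for each row.
nearest <- apply(dists, 1, which.min)

# Matrix indexing with a two-column (row, col) matrix picks one cell per row.
result <- data.frame(
  point        = seq_len(nrow(dists)),
  nearest_city = nearest,
  distance_m   = dists[cbind(seq_len(nrow(dists)), nearest)]
)
result
```

For the station/house problem, point would be the row number of the house and nearest_city the row number of the station, so either can be used to look up the original records.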

Just to prove this is a tenable solution for a large number of pairs, here's the same problem scaled up a bit.

worldcities_big <- do.call(rbind, replicate(250, worldcities, simplify = FALSE))
nrow(worldcities_big)
# [1] 1000
coords_big <- do.call(rbind, replicate(5000, coords, simplify = FALSE))
nrow(coords_big)
# [1] 10000
system.time(
  dists <- outer(seq_len(nrow(coords_big)), seq_len(nrow(worldcities_big)),
                 function(i, j) geosphere::distHaversine(coords_big[i,2:1], worldcities_big[j,2:1]))
)
#    user  system elapsed 
#   67.62    2.22   70.03 

So yes, it was not instantaneous, but 70 seconds is not horrible for 10,000,000 distance calculations. Could you make it faster? Perhaps, though I'm not sure precisely how to do so easily. I'd think some heuristics might reduce it from O(m*n) to O(m*log(n)) time, but I don't know if that's worth the coding complexity it would introduce.
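One possible way to cut the constant factor, while staying at O(m*n), is to compute the haversine formula directly on whole matrices in base R rather than going through data-frame indexing for every pair. The sketch below is my own substitute, not the answer's method: haversine_matrix is a hypothetical helper, and the radius 6378137 m is assumed to match the default used by geosphere::distHaversine. I have not timed it against the code above.

```r
# Hypothetical helper: all pairwise haversine distances (in meters) between
# the rows of p and the rows of q, both data frames with lat/lon in degrees.
haversine_matrix <- function(p, q, r = 6378137) {
  to_rad <- pi / 180
  lat1 <- p$lat * to_rad; lon1 <- p$lon * to_rad
  lat2 <- q$lat * to_rad; lon2 <- q$lon * to_rad
  dlat <- outer(lat1, lat2, "-")   # m x n matrix of latitude differences
  dlon <- outer(lon1, lon2, "-")   # m x n matrix of longitude differences
  a <- sin(dlat / 2)^2 + outer(cos(lat1), cos(lat2)) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))   # pmin guards against rounding above 1
}

worldcities <- data.frame(
  lat = c(39.7642548, 48.8588377, 26.9840891, 13.7245601),
  lon = c(-104.9951942, 2.2770206, 49.4080842, 100.493026))
coords <- data.frame(lat = c(27.9519571, 40.5351151),
                     lon = c(66.8681431, -108.4939948))

d <- haversine_matrix(coords, worldcities)
apply(d, 1, which.min)
```

The matrix d should agree closely with the dists output shown earlier. With 10000 by 1000 inputs, each intermediate matrix holds 10,000,000 doubles (roughly 80 MB), so for much larger inputs it may be worth processing the points in chunks.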


Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish, please cite this site's URL or the original address.
