How can I calculate distances between multiple latitude and longitude points?
I have 1100 station locations (latitude and longitude) and 10000 house locations (latitude and longitude). Is it possible to calculate the lowest distance between station and house for each house using R code? I also want the station that gives the lowest distance for each house. Is that possible?
Here's a toy example for finding mass distances between m points and n cities. It should translate directly to your station/house problem.
I brought up worldcities, spun the globe (so to speak), and stopped on four cities. I then spun again and stopped at two points. The two counts here are immaterial: whether we have 4 and 2 or 1100 and 10000, it should not matter much.
worldcities <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
lat,lon
39.7642548,-104.9951942
48.8588377,2.2770206
26.9840891,49.4080842
13.7245601,100.493026")
coords <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
lat,lon
27.9519571,66.8681431
40.5351151,-108.4939948")
(A quick note ... often, tools give us coordinates in "latitude, longitude", at least in my experience. geosphere functions, however, assume "longitude, latitude". My coordinates above were copied straight from random views in Google Maps, and I didn't want to edit them; because of this, I reverse the columns below with [,2:1] column indexing. If you forget and give coordinates that are undeniably not correct, you'll get the error Error in .pointsToMatrix(p1) : latitude < -90, which should be a prod that you have likely reversed the order of your coordinates. At which point you scratch your head and wonder if all of your other projects have used the wrong coordinates, calling your conclusions into question. Not me, I've never been there. This year.)
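Given how easy the lat/lon reversal is to make, a cheap guard before calling any geosphere function is to range-check the columns up front. A defensive sketch (check_latlon is a name I've made up, and it assumes data frames with lat/lon columns in degrees):

```r
# Range-check coordinate columns before handing them to geosphere.
# Assumes a data frame with 'lat' and 'lon' columns, in degrees.
check_latlon <- function(df) {
  stopifnot(all(abs(df$lat) <= 90), all(abs(df$lon) <= 180))
  invisible(df)
}

pts <- data.frame(lat = c(39.76, 48.86), lon = c(-104.99, 2.28))
check_latlon(pts)  # silent: values are plausible
# If the columns were swapped, -104.99 would land in 'lat' and the
# check would stop with a clearer message than "latitude < -90".
```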
Let's find the distance in meters between each of coords (each row) and each city (each column):
dists <- outer(seq_len(nrow(coords)), seq_len(nrow(worldcities)),
function(i, j) geosphere::distHaversine(coords[i,2:1], worldcities[j,2:1]))
dists
# [,1] [,2] [,3] [,4]
# [1,] 12452329.0 5895577 1726433 3822220
# [2,] 309802.8 7994185 12181477 13296825
It should be straightforward to find which city is closest to each coordinate, with
apply(dists, 1, which.min)
# [1] 3 1
That is, the first point is closest to the third city, and the second point is closest to the first city.
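Since the question also asks for the minimum distance itself, both pieces can be pulled from the same matrix and bound into one result. A minimal sketch using the toy 2x4 matrix from above (nearest_idx, nearest_m, and result are names I've made up):

```r
# Toy distance matrix: 2 points (rows) x 4 cities (columns), in meters,
# matching the 'dists' output above.
dists <- matrix(c(12452329.0, 5895577,  1726433,  3822220,
                    309802.8, 7994185, 12181477, 13296825),
                nrow = 2, byrow = TRUE)

nearest_idx <- apply(dists, 1, which.min)  # index of closest city per point
nearest_m   <- apply(dists, 1, min)        # the corresponding distance

result <- data.frame(point        = seq_len(nrow(dists)),
                     nearest_city = nearest_idx,
                     distance_m   = nearest_m)
# nearest_idx: 3 1
# nearest_m:   1726433.0 309802.8
```

For your data, nearest_city would be the row number of the closest station in your station table, and distance_m the distance to it.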
Just to prove this is a tenable solution for a large number of pairs, here's the same problem scaled up a bit.
worldcities_big <- do.call(rbind, replicate(250, worldcities, simplify = FALSE))
nrow(worldcities_big)
# [1] 1000
coords_big <- do.call(rbind, replicate(5000, coords, simplify = FALSE))
nrow(coords_big)
# [1] 10000
system.time(
dists <- outer(seq_len(nrow(coords_big)), seq_len(nrow(worldcities_big)),
function(i, j) geosphere::distHaversine(coords_big[i,2:1], worldcities_big[j,2:1]))
)
# user system elapsed
# 67.62 2.22 70.03
So yes, it was not instantaneous, but 70 seconds is not horrible for 10,000,000 distance calculations. Could you make it faster? Perhaps; I'm not sure precisely how, easily. I'd think some heuristics might reduce it from O(m*n) to O(m*log(n)) time, but I don't know if that's worth the coding complexity it would introduce.
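One likely cheaper route before reaching for heuristics: geosphere::distm builds the full point-by-point distance matrix directly, avoiding the per-pair R-level function calls that outer incurs here. A hedged sketch on the same toy data (assuming geosphere is installed; I haven't benchmarked it against the timing above):

```r
library(geosphere)

coords <- data.frame(lat = c(27.9519571, 40.5351151),
                     lon = c(66.8681431, -108.4939948))
worldcities <- data.frame(lat = c(39.7642548, 48.8588377, 26.9840891, 13.7245601),
                          lon = c(-104.9951942, 2.2770206, 49.4080842, 100.493026))

# distm expects (lon, lat), hence the same [,2:1] reversal as before.
dists <- distm(coords[, 2:1], worldcities[, 2:1], fun = distHaversine)
dim(dists)                  # 2 rows (points) x 4 columns (cities)
apply(dists, 1, which.min)  # same answer as the outer() approach: 3 1
```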