Find 5 closest stations from lat/lon coordinates

I'm trying to find the 5 closest stations from one data set (set1) to another data set (set2). This post is what I'm using as a basis, and finding the single closest station seems simple, but I'm writing for loops to handle it, which is not efficient. Furthermore, I'm getting an error and don't understand why it's not working. Ideally, I would like to use set1 to find the 5 closest stations in set2 and add a column for each of those stations, for each unique id in set1.

Edit: This question is different from "How to assign a name to lat-long observations based on shortest distance" because I'm trying to find the 5 closest stations, not just the single closest one. Also, the method for finding the minimum is different. Please reopen this question.

dput:

set1 <- structure(list(id = c(5984, 7495, 4752, 2654, 4578, 9865, 3265, 
1252, 4679, 1346), lat = c(48.39167, 48.148056, 48.721111, 47.189167, 
47.054443, 47.129166, 47.306667, 47.84, 47.304167, 48.109444), 
    lon = c(13.671114, 12.866947, 15.94223, 11.099736, 12.958342, 
    14.203892, 11.86389, 16.526674, 16.193064, 17.071392)), row.names = c(NA, 
10L), class = "data.frame", .Names = c("id", "lat", "lon"))

set2 <- structure(list(id = 1:10, lat = structure(c(35.8499984741211, 
34.75, 70.9329986572266, 78.25, 69.6829986572266, 74.515998840332, 
70.3659973144531, 67.265998840332, 63.6990013122559, 60.1990013122559
), .Dim = 10L), lon = structure(c(14.4829998016357, 32.4000015258789, 
-8.66600036621094, 15.4670000076294, 18.9160003662109, 19.0160007476807, 
31.0990009307861, 14.3660001754761, 9.59899997711182, 11.0830001831055
), .Dim = 10L)), row.names = c(NA, 10L), class = "data.frame", .Names = c("id", 
"lat", "lon"))

Code:

library(rgeos)
library(sp)


set1sp <- SpatialPoints(set1)
set2sp <- SpatialPoints(set2)
for (i in length(set1$id)){
  for (j in 4:9){
    if(i == 1) {
      sub <- set2
      set1[i,j] <- apply(gDistance(set1sp, set2sp, byid=TRUE), 1, which.min)
      sub <- filter(sub, id != set1[i,j])}
    else{
      set1[i,j] <- apply(gDistance(set1sp, set2sp, byid=TRUE), 1, which.min)
      sub <- filter(sub, id != set1[i,j])}
  }
}

Output error:

 Error in `[<-.data.frame`(`*tmp*`, i, j, value = c(8L, 8L, 8L, 8L, 8L,  : 
  replacement has 10 rows, data has 1 
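
The error message can be reproduced without any spatial packages, which may help narrow it down. The sketch below is base R on a made-up three-row data frame (`df` and its values are illustrative, not part of the data above). It suggests two things to check in the loop: `for (i in length(x))` visits only the last index rather than every row, and assigning a whole vector to a single cell fails in the same way as the message above.

```r
# Toy data frame; `df` and its values are illustrative only.
df <- data.frame(id = c(11, 22, 33))

# for (i in length(x)) runs once, with i equal to the length of x.
visited_wrong <- c()
for (i in length(df$id)) visited_wrong <- c(visited_wrong, i)

# seq_along(x) visits every index.
visited_right <- c()
for (i in seq_along(df$id)) visited_right <- c(visited_right, i)

# Assigning a length-3 vector to one cell raises an error;
# the message is captured instead of stopping the script.
msg <- tryCatch({
  df[1, 2] <- c(8L, 8L, 8L)
  "no error"
}, error = function(e) conditionMessage(e))

print(visited_wrong)
print(visited_right)
print(msg)
```

In the question's code, `apply(gDistance(...), 1, which.min)` returns one value per row of the distance matrix, so the right-hand side has 10 values while `set1[i,j]` is a single cell.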

I had to set the projection system and the coordinates for set1sp and set2sp to make gDistance work. I assumed WGS84.

dummyset1= set1
dummyset2= set2
coordinates(set1) = c('lon', 'lat')
coordinates(set2) = c('lon', 'lat')
proj4string(set1) = "+proj=longlat +datum=WGS84"
proj4string(set2) = "+proj=longlat +datum=WGS84"
set1sp = set1
set2sp = set2
set1 = dummyset1
set2 = dummyset2

This loop returns output in the shape you wanted, using the general structure of your for loop.

for (i in 1:length(set1$id)){
    #Store the projected data in a dummy variable sub
    sub <- set2sp
    for (j in 4:8){
        if (j == 4){
           set1[i,j] <- apply(gDistance(set2sp['id'], set1sp['id'][i,], byid=TRUE), 1, which.min)
           #Remove the index of the closest point from sub.
           sub <- sub[which(sub$id != set1[i,j]), ]
        }
        else {
           #Note that sub is now being checked instead of set2sp. This is because sub has had the index of the closest point removed.
           set1[i,j] <- apply(gDistance(sub['id'], set1sp['id'][i,], byid=TRUE), 1, which.min)
           sub <- sub[which(sub$id != set1[i,j]), ]
        }
    }
}

The resulting output is shown below; V4 to V8 hold the value returned by which.min at each step:

set1
   id      lat      lon V4 V5 V6 V7 V8
1  5984 48.39167 13.67111 10  1  8  7  6
2  7495 48.14806 12.86695 10  1  8  7  6
3  4752 48.72111 15.94223 10  1  8  7  6
4  2654 47.18917 11.09974  1  9  8  7  6
5  4578 47.05444 12.95834  1  9  8  7  6
6  9865 47.12917 14.20389  1  9  8  7  6
7  3265 47.30667 11.86389  1  9  8  7  6
8  1252 47.84000 16.52667  1  9  8  7  6
9  4679 47.30417 16.19306  1  9  8  7  6
10 1346 48.10944 17.07139  1  9  8  7  6
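
One thing worth checking in this table: `which.min` returns a position within whatever object it is given, and rows are removed from `sub` between steps, so from the second or third pick onward a stored position may no longer equal a `set2` id. The small sketch below uses made-up distances (the data frame and its `d` column are hypothetical) to show the difference and one way to map the position back to an id.

```r
# Hypothetical stations and distances, for illustration only.
sub <- data.frame(id = 1:5, d = c(4.0, 9.0, 7.0, 1.0, 2.0))

# First pick: position and id coincide because nothing was removed yet.
pos1 <- which.min(sub$d)
id1  <- sub$id[pos1]
sub  <- sub[sub$id != id1, ]  # drop the chosen station by id

# Second pick: the nearest remaining station has id 5,
# but it now sits at position 4 of the shortened data frame.
pos2 <- which.min(sub$d)
id2  <- sub$id[pos2]

print(c(position = pos2, id = id2))
```

Storing `sub$id[pos]` rather than `pos` in the loop would keep every column in terms of `set2` ids.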

The following computes great-circle distances between every point in set2 and every point in set1. For each set2 station it then takes the minimum distance over set1, orders these minima, and plots the five closest stations.

library(sp)
coordinates(set1) = c('lon', 'lat')
coordinates(set2) = c('lon', 'lat')
proj4string(set1) = "+proj=longlat +datum=WGS84"
proj4string(set2) = "+proj=longlat +datum=WGS84"
d = apply(spDists(set1,set2),2,min)
order(d)[1:5]
# [1]  1 10  9  2  8
plot(set2, pch=2, axes=TRUE)
points(set1)
o = order(d)[1:5]
points(set2[o,], col = 'red', pch=16)
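
The snippet above ranks set2 stations by their distance to the nearest set1 point overall. For the layout asked for in the question (five set2 ids for each set1 row), one option is to sort each row of the distance matrix once. The sketch below is base R only and makes some assumptions: it uses the haversine formula on a sphere of radius 6371 km instead of the ellipsoidal distance from `spDists`, the helper names `haversine_km` and `nearest_k` are made up for illustration, and only two rows of set1 are used, with coordinates rounded to three decimals.

```r
# Great-circle distance in km on a sphere (haversine); the radius is an assumption.
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Two rows of set1 and all of set2, rounded to three decimals.
set1 <- data.frame(id = c(5984, 2654),
                   lat = c(48.392, 47.189),
                   lon = c(13.671, 11.100))
set2 <- data.frame(id = 1:10,
                   lat = c(35.850, 34.750, 70.933, 78.250, 69.683,
                           74.516, 70.366, 67.266, 63.699, 60.199),
                   lon = c(14.483, 32.400, -8.666, 15.467, 18.916,
                           19.016, 31.099, 14.366, 9.599, 11.083))

# Distance matrix: one row per set1 point, one column per set2 station.
dmat <- outer(seq_len(nrow(set1)), seq_len(nrow(set2)),
              function(i, j) haversine_km(set1$lat[i], set1$lon[i],
                                          set2$lat[j], set2$lon[j]))

# For each row, sort once and keep the ids of the k closest stations (k >= 2).
nearest_k <- function(dmat, ids, k = 5) {
  t(apply(dmat, 1, function(d) ids[order(d)[seq_len(k)]]))
}

nearest <- nearest_k(dmat, set2$id)
colnames(nearest) <- paste0("near", 1:5)
result <- cbind(set1, nearest)
print(result)
```

Because the columns hold ids taken from `set2$id` rather than row positions, the result stays valid even if set2 is reordered or its ids are not 1 to n.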
