简体   繁体   English

使用 sf 通过两个数据帧之间的不同年份计算最近点坐标和距离

[英]Calculate nearest point coordinate and distance by differing years between two data frames using sf

For each observation in a data frame in a year I am trying to finde the nearest observation in another data frame one year prior and calculate their distance.对于一年中数据框中的每个观察值,我试图在一年前的另一个数据框中找到最近的观察值并计算它们的距离。

Following this ( https://gis.stackexchange.com/questions/349955/getting-a-new-column-with-distance-to-the-nearest-feature-in-r ) approach, I wrote the following code:按照这种( https://gis.stackexchange.com/questions/349955/getting-a-new-column-with-distance-to-the-nearest-feature-in-r )方法,我编写了以下代码:

for(x in 2000:2020) {
  R36_loc$nearest <- st_nearest_points(
    R36_loc %>% ungroup() %>% filter(year == x),
    mining_loc %>% ungroup() %>% filter(year == x - 1)
  )
}
R36_loc$dist_near_mine = st_distance(R36_loc, mining_loc[nearest,], by_element=TRUE)

My data looks like this: mining_loc :我的数据如下所示: mining_loc

structure(list(year = structure(c(2009, 2007, 2008, 2009, 2007, 
2007, 2009, 2008, 2010, 2008, 2011, 2002, 2012, 2012, 2009, 2010, 
2012, 2006, 2014, 2013, 2008, 2010, 2006, 2011, 2004, 2006, 2011, 
2012, 2014, 2005), label = "year", format.stata = "%10.0g"), 
    geometry = structure(list(structure(c(29.6789, -3.5736), class = c("XY", 
    "POINT", "sfg")), structure(c(29.146988, -26.09538), class = c("XY", 
    "POINT", "sfg")), structure(c(0.089167, 35.93111), class = c("XY", 
    "POINT", "sfg")), structure(c(29.915396, -20.535308), class = c("XY", 
    "POINT", "sfg")), structure(c(28.01295, -26.22712), class = c("XY", 
    "POINT", "sfg")), structure(c(-8.88214, 31.86011), class = c("XY", 
    "POINT", "sfg")), structure(c(6.475727, 30.66071), class = c("XY", 
    "POINT", "sfg")), structure(c(-2.04396, 5.243666), class = c("XY", 
    "POINT", "sfg")), structure(c(27.702666, -21.358855), class = c("XY", 
    "POINT", "sfg")), structure(c(48.650001, -16.176654), class = c("XY", 
    "POINT", "sfg")), structure(c(33.23611, 28.59167), class = c("XY", 
    "POINT", "sfg")), structure(c(30.945726, -22.507772), class = c("XY", 
    "POINT", "sfg")), structure(c(22.90999, -27.175352), class = c("XY", 
    "POINT", "sfg")), structure(c(10.44916725, 35.54916763), class = c("XY", 
    "POINT", "sfg")), structure(c(-12.136052, 7.765232), class = c("XY", 
    "POINT", "sfg")), structure(c(32.89942, 24.09082), class = c("XY", 
    "POINT", "sfg")), structure(c(28.58115, -25.256046), class = c("XY", 
    "POINT", "sfg")), structure(c(31.673825, -28.221349), class = c("XY", 
    "POINT", "sfg")), structure(c(12.916667, 18.683333), class = c("XY", 
    "POINT", "sfg")), structure(c(8.915834, 33.53159), class = c("XY", 
    "POINT", "sfg")), structure(c(17.71667, -19.21667), class = c("XY", 
    "POINT", "sfg")), structure(c(27.88332939, -12.46667004), class = c("XY", 
    "POINT", "sfg")), structure(c(33.98638, 17.70217), class = c("XY", 
    "POINT", "sfg")), structure(c(27.302793, -25.65206), class = c("XY", 
    "POINT", "sfg")), structure(c(-8.10837, 6.87479), class = c("XY", 
    "POINT", "sfg")), structure(c(-5.03293, 31.50764), class = c("XY", 
    "POINT", "sfg")), structure(c(38.66667, -3.81667), class = c("XY", 
    "POINT", "sfg")), structure(c(27.191434, -27.390284), class = c("XY", 
    "POINT", "sfg")), structure(c(31.924721, -28.841876), class = c("XY", 
    "POINT", "sfg")), structure(c(-10.7299, 11.32676), class = c("XY", 
    "POINT", "sfg"))), class = c("sfc_POINT", "sfc"), precision = 0, bbox = structure(c(xmin = -12.136052, 
    ymin = -28.841876, xmax = 48.650001, ymax = 35.93111), class = "bbox"), crs = structure(list(
        input = "EPSG:4326", wkt = "GEOGCRS[\"WGS 84\",\n    ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n        MEMBER[\"World Geodetic System 1984 (Transit)\"],\n        MEMBER[\"World Geodetic System 1984 (G730)\"],\n        MEMBER[\"World Geodetic System 1984 (G873)\"],\n        MEMBER[\"World Geodetic System 1984 (G1150)\"],\n        MEMBER[\"World Geodetic System 1984 (G1674)\"],\n        MEMBER[\"World Geodetic System 1984 (G1762)\"],\n        MEMBER[\"World Geodetic System 1984 (G2139)\"],\n        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n            LENGTHUNIT[\"metre\",1]],\n        ENSEMBLEACCURACY[2.0]],\n    PRIMEM[\"Greenwich\",0,\n        ANGLEUNIT[\"degree\",0.0174532925199433]],\n    CS[ellipsoidal,2],\n        AXIS[\"geodetic latitude (Lat)\",north,\n            ORDER[1],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        AXIS[\"geodetic longitude (Lon)\",east,\n            ORDER[2],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n    USAGE[\n        SCOPE[\"Horizontal component of 3D system.\"],\n        AREA[\"World.\"],\n        BBOX[-90,-180,90,180]],\n    ID[\"EPSG\",4326]]"), class = "crs"), n_empty = 0L)), class = c("sf", 
"grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-30L), groups = structure(list(year = structure(c(2002, 2004, 
2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014), label = "year", format.stata = "%10.0g"), 
    .rows = structure(list(12L, 25L, 30L, c(18L, 23L, 26L), c(2L, 
    5L, 6L), c(3L, 8L, 10L, 21L), c(1L, 4L, 7L, 15L), c(9L, 16L, 
    22L), c(11L, 24L, 27L), c(13L, 14L, 17L, 28L), 20L, c(19L, 
    29L)), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", 
    "list"))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-12L), .drop = TRUE), sf_column = "geometry", agr = structure(c(year = NA_integer_), levels = c("constant", 
"aggregate", "identity"), class = "factor"))

and R36_loc :R36_loc

structure(list(year = c(2012, 2013, 2008, 2005, 2012, 2013, 2005, 
2013, 2008, 2005, 2012, 2012, 2008, 2005, 2005, 2009, 2008, 2012, 
2005, 2006, 2012, 2005, 2008, 2012, 2012, 2005, 2008, 2008, 2008, 
2005), geometry = structure(list(structure(c(29.17557, -21.20929
), class = c("XY", "POINT", "sfg")), structure(c(-13.75231, 9.4795399
), class = c("XY", "POINT", "sfg")), structure(c(-8.5474997, 
6.82056), class = c("XY", "POINT", "sfg")), structure(c(-23.522779, 
14.91389), class = c("XY", "POINT", "sfg")), structure(c(-2.64236, 
7.8043299), class = c("XY", "POINT", "sfg")), structure(c(40.041, 
-0.17200001), class = c("XY", "POINT", "sfg")), structure(c(33.48946, 
-9.1142197), class = c("XY", "POINT", "sfg")), structure(c(-7.07623, 
4.6770301), class = c("XY", "POINT", "sfg")), structure(c(34.116669, 
-14.15), class = c("XY", "POINT", "sfg")), structure(c(35.650669, 
-15.80635), class = c("XY", "POINT", "sfg")), structure(c(-11.01406, 
6.6858401), class = c("XY", "POINT", "sfg")), structure(c(34.030159, 
0.84144002), class = c("XY", "POINT", "sfg")), structure(c(34.191002, 
1.016), class = c("XY", "POINT", "sfg")), structure(c(37.385761, 
-1.94943), class = c("XY", "POINT", "sfg")), structure(c(2.23564, 
7.8688698), class = c("XY", "POINT", "sfg")), structure(c(29.5, 
-18.75), class = c("XY", "POINT", "sfg")), structure(c(36.803509, 
-14.32926), class = c("XY", "POINT", "sfg")), structure(c(25.883329, 
-24.48333), class = c("XY", "POINT", "sfg")), structure(c(26.987329, 
-16.688841), class = c("XY", "POINT", "sfg")), structure(c(25.636339, 
-33.974258), class = c("XY", "POINT", "sfg")), structure(c(-11.133, 
6.8152399), class = c("XY", "POINT", "sfg")), structure(c(35.416672, 
-4.1500001), class = c("XY", "POINT", "sfg")), structure(c(28.75, 
-30), class = c("XY", "POINT", "sfg")), structure(c(57.633331, 
-20.41667), class = c("XY", "POINT", "sfg")), structure(c(33.5, 
-3.6666701), class = c("XY", "POINT", "sfg")), structure(c(35.27496, 
-0.56010997), class = c("XY", "POINT", "sfg")), structure(c(3.30757, 
6.63937), class = c("XY", "POINT", "sfg")), structure(c(-13.647, 
13.605), class = c("XY", "POINT", "sfg")), structure(c(32.209759, 
-2.80952), class = c("XY", "POINT", "sfg")), structure(c(36.71236, 
1.78276), class = c("XY", "POINT", "sfg"))), class = c("sfc_POINT", 
"sfc"), precision = 0, bbox = structure(c(xmin = -23.522779, 
ymin = -33.974258, xmax = 57.633331, ymax = 14.91389), class = "bbox"), crs = structure(list(
    input = "EPSG:4326", wkt = "GEOGCRS[\"WGS 84\",\n    ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n        MEMBER[\"World Geodetic System 1984 (Transit)\"],\n        MEMBER[\"World Geodetic System 1984 (G730)\"],\n        MEMBER[\"World Geodetic System 1984 (G873)\"],\n        MEMBER[\"World Geodetic System 1984 (G1150)\"],\n        MEMBER[\"World Geodetic System 1984 (G1674)\"],\n        MEMBER[\"World Geodetic System 1984 (G1762)\"],\n        MEMBER[\"World Geodetic System 1984 (G2139)\"],\n        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n            LENGTHUNIT[\"metre\",1]],\n        ENSEMBLEACCURACY[2.0]],\n    PRIMEM[\"Greenwich\",0,\n        ANGLEUNIT[\"degree\",0.0174532925199433]],\n    CS[ellipsoidal,2],\n        AXIS[\"geodetic latitude (Lat)\",north,\n            ORDER[1],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        AXIS[\"geodetic longitude (Lon)\",east,\n            ORDER[2],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n    USAGE[\n        SCOPE[\"Horizontal component of 3D system.\"],\n        AREA[\"World.\"],\n        BBOX[-90,-180,90,180]],\n    ID[\"EPSG\",4326]]"), class = "crs"), n_empty = 0L)), row.names = c(NA, 
-30L), class = c("sf", "tbl_df", "tbl", "data.frame"), sf_column = "geometry", agr = structure(c(year = NA_integer_), levels = c("constant", 
"aggregate", "identity"), class = "factor"))

Each observation from R36_loc should show the distance to the nearest observation in mining_loc on year prior in a new variable.来自R36_loc的每个观测值都应在新变量中显示到前一年mining_loc中最近观测值的距离。

The first error I get, I think, is due to some years not having any observations (Error in UseMethod("st_as_sfc"): no applicable method for 'st_as_sfc' applied to an object of class "NULL").我认为,我得到的第一个错误是由于几年没有任何观察结果(UseMethod 中的错误(“st_as_sfc”):没有适用于“st_as_sfc”的方法应用于 class“NULL”的 object)。

When I only loop through existing years I get当我只遍历现有年份时,我得到

Error:
! Assigned data `value` must be compatible with existing data.
✖ Existing data has 7207 rows.
✖ Assigned data has 352800 rows.
ℹ Only vectors of size 1 are recycled.
Backtrace:
  1. base::`$<-`(`*tmp*`, nearest, value = `<GEOMETRY [°]>`)
 19. tibble (local) `<fn>`(`<vctrs___>`)"

I found a way to do this using the RANN package.我找到了一种使用RANN package 的方法。 I start by extracting the geometry to long and lat columns and converting my data frame to a list of data frames by year:我首先将几何提取到 long 和 lat 列,然后按年份将我的数据框转换为数据框列表:

R36_loc2 <- R36_loc %>% ungroup() %>% mutate(long = unlist(map(.$geometry,1)),
           lat = unlist(map(.$geometry,2)))
st_geometry(R36_loc2) <- NULL

AB_by_year <- split(R36_loc2, f = R36_loc$year)

Since, for the second data frame, I need the observations from a year prior, I create a new year variable merge_year and also transform the data into a list by the new variable:因为,对于第二个数据框,我需要前一年的观察结果,所以我创建了一个新的年份变量merge_year并将数据转换为新变量的列表:

mining_loc$merge_year <- mining_loc$year - 1
# make list of data by merging year
mining_by_year <- split(mining_loc, f = mining_loc$merge_year)
# make ID var
mining_by_year <- mining_by_year %>% lapply(function(x) {x %>% rowid_to_column("ID")})

I then loop through the years and look for closest mine to each observation in each year - merge_year - combination, then add two new columns [, c(43,44)] to each year data frame in the AB-list of data frames.然后我遍历这些年并寻找最接近每年每个观察的我的 - merge_year - 组合,然后将两个新列[, c(43,44)]添加到 AB 数据帧列表中的每个年份数据帧。 The two columns will indicate the ID of closest mine to each observation in the corresponding year-dataframe in the mining_list, called nn.idx , and the distance, called nn.dists .这两列将指示与 mining_list 中相应年份数据框中每个观察值最近的矿井的 ID,称为nn.idx和距离,称为nn.dists

for(x in wave_years) {
  AB_by_year[[as.character(x)]][ , c(43,44)] <- as.data.frame(RANN::nn2(mining_by_year[[as.character(x)]][,c("lat", "long")], AB_by_year[[as.character(x)]][,c("lat", "long")], k=1)
  )
}

I then check if it worked, by creating maps that connect the observations to the mines.然后我通过创建将观察结果与地雷连接起来的地图来检查它是否有效。

I first create a list for the lines to nearest mine我首先为离我最近的线路创建一个列表

lines_list <- vector(mode = "list", length = length(wave_years))
names(lines_list) <- wave_years

I joint the observations with each nearest mine coordinates我将观察结果与每个最近的地雷坐标结合起来

for(x in wave_years) {
  lines_list[[as.character(x)]] <- left_join(AB_by_year[[as.character(x)]], mining_by_year[[as.character(x)]], by = c("nn.idx" = "ID"))
}

I then need to convert the list back to a data frame:然后我需要将列表转换回数据框:

lines <- do.call(rbind.data.frame, lines_list) 

and now I follow the approach of: Connecting two sets of coordinates to create lines using sf/mapview现在我遵循以下方法: 使用 sf/mapview 连接两组坐标以创建线条

b = lines[, c("long.x", "lat.x")]
names(b) = c("long", "lat")
e = lines[, c("long.y", "lat.y")]
names(e) = c("long", "lat")

lines$geometry = do.call(
  "c", 
  lapply(seq(nrow(b)), function(i) {
    st_sfc(
      st_linestring(
        as.matrix(
          rbind(b[i, ], e[i, ])
        )
      ),
      crs = 4326
    )
  }))

Finally, I want to show graphically, that the code worked by first converting the data into sf-objects最后,我想以图形方式显示,代码通过首先将数据转换为 sf 对象来工作

mining_loc_geo <- st_as_sf(mining_loc, coords = c("long", "lat"), crs = 4326)
R36_loc_geo <- st_as_sf(R36_loc, coords = c("long", "lat"), crs = 4326)

and then plotting them with ggplot.然后用ggplot绘制它们。

ggplot() + geom_sf(data = boundaries_africa3, aes()) + geom_sf(data = R36_loc_geo %>% filter(year == 2005), color = "blue", aes(geometry = geometry)) + geom_sf(data = mining_loc_geo %>% filter(merge_year == 2005), color = "red", aes(geometry = geometry)) + geom_sf(data = lines %>% filter(year.x == 2005), aes(geometry = geometry))

The object boundaries_africa3 is an underlying map. object boundaries_africa3是底层 map。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM