
Calculate a centre point of multiple lat, long points in a data-frame

I have a dataset that looks like this:

site   lat      long 
bras2  41.21   -115.11
tex4   45.3    -112.31
bras2  41.15   -115.15 
bras2  41.12   -115.19

For samples with the same site name, I want to calculate their centre point and then add it as columns to the dataset. Some site names appear twice, others three times, others four times.

Like this:

site   lat      long    centre_lat  centre_long 
bras2  41.21   -115.11  value here     value here
tex4   45.3    -112.31  45.3           -112.31 
bras2  41.15   -115.15  value here     value here
bras2  41.12   -115.19  value here     value here

How can I do this?

If you're using spatial data, you should look into using the sf package. It handles geometries, and the functions for operating on them, well.

The code below uses both sf::st_centroid and geosphere::centroid. I prefer sf's way of doing things.

df <- read.table(header=TRUE, text= "site   lat      long 
bras2  41.21   -115.11
tex4   45.3    -112.31
bras2  41.15   -115.15 
bras2  41.12   -115.19")


library(dplyr)
library(geosphere)
library(sf)

# Using sf's st_centroid
df_sf <- st_as_sf(df, coords = c('long', 'lat'))

centroids_sf <- df_sf %>%
  group_by(site) %>% 
  summarize(geometry = st_union(geometry)) %>% 
  st_centroid()

  
# Using geosphere::centroid
centroids_geosphere <- df_sf %>%
  group_by(site) %>%
  filter(n() > 2) %>% # geosphere needs polygons, therefore 3+ points
  st_union() %>%
  st_cast('POLYGON') %>%
  as('Spatial') %>% # geosphere expects SpatialPolygons objects
  centroid()


centroids_geosphere
#>         [,1]     [,2]
#> [1,] -115.15 41.16001
centroids_sf
#> Simple feature collection with 2 features and 1 field
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: -115.15 ymin: 41.16 xmax: -112.31 ymax: 45.3
#> CRS:            NA
#> # A tibble: 2 x 2
#>   site         geometry
#> * <chr>         <POINT>
#> 1 bras2 (-115.15 41.16)
#> 2 tex4   (-112.31 45.3)

Looks like they're close enough to the same point. I don't think geosphere::centroid can give a centroid for a single point, but I may be wrong. sf::st_centroid has no problem with 1, 2, or more points.

Created on 2020-12-20 by the reprex package (v0.3.0)
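The question asks for the centre to be added back to every row as centre_lat and centre_long. A minimal sketch of that last step, assuming the same sf pipeline as above; the names xy, centre and df_out are made up for illustration:

```r
library(dplyr)
library(sf)

df <- read.table(header = TRUE, text = "site lat long
bras2 41.21 -115.11
tex4  45.3  -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19")

# One centroid per site, same approach as in the answer above
centroids_sf <- st_as_sf(df, coords = c("long", "lat")) %>%
  group_by(site) %>%
  summarize(geometry = st_union(geometry)) %>%
  st_centroid()

# st_coordinates() returns a matrix with columns X (long) and Y (lat)
xy <- st_coordinates(centroids_sf)
centre <- data.frame(
  site        = centroids_sf$site,
  centre_lat  = unname(xy[, "Y"]),
  centre_long = unname(xy[, "X"])
)

# Attach the per-site centre to every original row; row order is kept
df_out <- left_join(df, centre, by = "site")
df_out
```

The single-point site tex4 simply gets its own coordinates as its centre.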

You could calculate the means grouped by site name using ave, after stripping off the site numbers using gsub.

within(dat, {
  g <- gsub("\\d", "", site)
  mid.lat <- ave(lat, g)
  mid.long <- ave(long, g)
  rm(g)
})
#    site   lat    long mid.long mid.lat
# 1 bras2 41.21 -115.11 -115.150  41.160
# 2  tex4 45.30 -112.31 -112.310  45.300
# 3 bras2 41.15 -115.15 -115.150  41.160
# 4 bras2 41.12 -115.19 -115.150  41.160
# 5  foo1 42.10 -123.10 -123.225  42.225
# 6  foo2 42.20 -123.20 -123.225  42.225
# 7 foo11 42.30 -123.30 -123.225  42.225
# 8 foo12 42.30 -123.30 -123.225  42.225

Or, if you need NA for sites that have only a single point:

within(dat, {
  g <- gsub("\\d", "", site)
  n <- ave(seq_along(site), g, FUN=length)  # group size for each row
  mid.lat <- rep(NA_real_, length(lat))
  mid.long <- rep(NA_real_, length(long))
  mid.lat[n > 1] <- ave(lat[n > 1], g[n > 1])
  mid.long[n > 1] <- ave(long[n > 1], g[n > 1])
  rm(g, n)
})
#    site   lat    long mid.long mid.lat
# 1 bras2 41.21 -115.11 -115.150  41.160
# 2  tex4 45.30 -112.31       NA      NA
# 3 bras2 41.15 -115.15 -115.150  41.160
# 4 bras2 41.12 -115.19 -115.150  41.160
# 5  foo1 42.10 -123.10 -123.225  42.225
# 6  foo2 42.20 -123.20 -123.225  42.225
# 7 foo11 42.30 -123.30 -123.225  42.225
# 8 foo12 42.30 -123.30 -123.225  42.225

Data:

dat <- structure(list(site = c("bras2", "tex4", "bras2", "bras2", "foo1", 
"foo2", "foo11", "foo12"), lat = c(41.21, 45.3, 41.15, 41.12, 
42.1, 42.2, 42.3, 42.3), long = c(-115.11, -112.31, -115.15, 
-115.19, -123.1, -123.2, -123.3, -123.3)), class = "data.frame", row.names = c(NA, 
-8L))
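The snippets above group on the site name with its digits stripped, so foo1 and foo2 share one centre. If rows should only be grouped when the full site name matches, as the question describes, a minimal base R sketch could look like this (the column names centre_lat and centre_long are taken from the question's desired output):

```r
df <- read.table(header = TRUE, text = "site lat long
bras2 41.21 -115.11
tex4  45.3  -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19")

# ave() returns the group mean (its default FUN) repeated for every row
df <- transform(df,
  centre_lat  = ave(lat, site),
  centre_long = ave(long, site)
)
df
```

This is a plain arithmetic mean of the coordinates, which should be adequate for points that lie close together, as they do here.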

The geosphere package has a function, centroid, to solve problems such as this one.
As long as there is more than one point in the shape, it is straightforward. Most of the code below handles the single-point case in the example above.

df <- read.table(header=TRUE, text= "site   lat      long 
bras2  41.21   -115.11
tex4   45.3    -112.31
bras2  41.15   -115.15 
bras2  41.12   -115.19")


library(dplyr)
library(geosphere)

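# centroid() expects coordinates in (longitude, latitude) order, hence columns c(3, 2).
# Split the data by site and handle single-point sites separately.
sites <- split(df, df$site)
results <- lapply(sites, function(x) {
   if (nrow(x) > 1) {
     value <- as.data.frame(centroid(x[, c(3, 2)]))
   } else {
      # a single point is its own centre
      value <- x[1, c(3, 2)]
      names(value) <- c("lon", "lat")
   }
   value$site <- x$site[1]
   value
})

answer <- bind_rows(results)
answer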

      lon      lat  site
1 -115.15 41.16001 bras2
2 -112.31 45.30000  tex4
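For points spread over a large area, or on both sides of the 180° meridian, a plain average of latitude and longitude can be misleading. One common alternative, sketched below in base R, averages the points as 3D unit vectors and converts the result back to degrees. This is not what geosphere::centroid computes (that is a polygon centroid), and sphere_mean is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: mean position of points on a sphere.
# Each (lat, long) pair is turned into a 3D unit vector, the vectors are
# averaged, and the mean vector is converted back to degrees.
sphere_mean <- function(lat, long) {
  phi <- lat * pi / 180   # latitude in radians
  lam <- long * pi / 180  # longitude in radians
  x <- mean(cos(phi) * cos(lam))
  y <- mean(cos(phi) * sin(lam))
  z <- mean(sin(phi))
  c(lat  = atan2(z, sqrt(x^2 + y^2)) * 180 / pi,
    long = atan2(y, x) * 180 / pi)
}

df <- read.table(header = TRUE, text = "site lat long
bras2 41.21 -115.11
tex4  45.3  -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19")

# One row per site; works for single-point sites as well
centres <- do.call(rbind, lapply(split(df, df$site), function(g) {
  sphere_mean(g$lat, g$long)
}))
centres
```

For the closely spaced points in this example the result should agree with the arithmetic mean to several decimal places; the difference only starts to matter for widely separated points.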
