Calculate a centre point of multiple lat, long points in a data-frame
I have a dataset that looks like this:
site lat long
bras2 41.21 -115.11
tex4 45.3 -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19
For samples with the same site name, I want to calculate their centre point and then add it as a column to the dataset. Some site names appear twice, others three times, others four times.
Like this:
site lat long centre_lat centre_long
bras2 41.21 -115.11 value here value here
tex4 45.3 -112.31 45.3 -112.31
bras2 41.15 -115.15 value here value here
bras2 41.12 -115.19 value here value here
How can I do this?
If you're using spatial data, you should look into using the sf package. It handles geometries, and functions for operating on them, well.
The code below uses both sf::st_centroid and geosphere::centroid. I prefer sf's way of doing things.
df <- read.table(header=TRUE, text= "site lat long
bras2 41.21 -115.11
tex4 45.3 -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19")
library(dplyr)
library(geosphere)
library(sf)
# Using sf's st_centroid
df_sf <- st_as_sf(df, coords = c('long', 'lat'))
centroids_sf <- df_sf %>%
  group_by(site) %>%
  summarize(geometry = st_union(geometry)) %>%
  st_centroid()
# Using geosphere::centroid
centroids_geosphere <- df_sf %>%
  group_by(site) %>%
  filter(n() > 2) %>%       # geosphere needs polygons, therefore 3+ points
  st_union() %>%
  st_cast('POLYGON') %>%
  as('Spatial') %>%         # geosphere expects SpatialPolygons objects
  centroid()
centroids_geosphere
#> [,1] [,2]
#> [1,] -115.15 41.16001
centroids_sf
#> Simple feature collection with 2 features and 1 field
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: -115.15 ymin: 41.16 xmax: -112.31 ymax: 45.3
#> CRS: NA
#> # A tibble: 2 x 2
#> site geometry
#> * <chr> <POINT>
#> 1 bras2 (-115.15 41.16)
#> 2 tex4 (-112.31 45.3)
Looks like they're close enough to the same point. I don't think geosphere::centroid can give a centroid for a single point, but I may be wrong. sf::st_centroid has no problem with 1, 2, or more points.
Created on 2020-12-20 by the reprex package (v0.3.0)
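To illustrate why the two results agree so closely, here is a minimal base R sketch that compares the plain arithmetic mean of the bras2 coordinates with a mean taken on the unit sphere. This is a different technique from both sf::st_centroid and geosphere::centroid, and sphere_mean is a hypothetical helper written only for this illustration. For points this close together, the two approaches should differ by far less than the precision of the input.

```r
# Hypothetical helper (not from sf or geosphere): average the points as
# 3D unit vectors, then convert the mean vector back to lat/long in degrees.
sphere_mean <- function(lat, long) {
  phi <- lat * pi / 180   # latitude in radians
  lam <- long * pi / 180  # longitude in radians
  x <- mean(cos(phi) * cos(lam))
  y <- mean(cos(phi) * sin(lam))
  z <- mean(sin(phi))
  c(lat  = atan2(z, sqrt(x^2 + y^2)) * 180 / pi,
    long = atan2(y, x) * 180 / pi)
}

# The three bras2 points from the question
bras2 <- data.frame(lat  = c(41.21, 41.15, 41.12),
                    long = c(-115.11, -115.15, -115.19))

sph    <- sphere_mean(bras2$lat, bras2$long)
planar <- c(lat = mean(bras2$lat), long = mean(bras2$long))

# Absolute difference between the two estimates, in degrees
diff_deg <- abs(sph - planar)
diff_deg
```

For sites whose points are spread over large distances, or that straddle the 180° meridian, the arithmetic mean is no longer a safe shortcut and a spherical or geodesic method is the better choice.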
You could calculate the means grouped by site names using ave after stripping off the site numbers using gsub.
within(dat, {
  g <- gsub("\\d", "", site)   # site name without its trailing number
  mid.lat <- ave(lat, g)       # group mean of lat
  mid.long <- ave(long, g)     # group mean of long
  rm(g)
})
# site lat long mid.long mid.lat
# 1 bras2 41.21 -115.11 -115.150 41.160
# 2 tex4 45.30 -112.31 -112.310 45.300
# 3 bras2 41.15 -115.15 -115.150 41.160
# 4 bras2 41.12 -115.19 -115.150 41.160
# 5 foo1 42.10 -123.10 -123.225 42.225
# 6 foo2 42.20 -123.20 -123.225 42.225
# 7 foo11 42.30 -123.30 -123.225 42.225
# 8 foo12 42.30 -123.30 -123.225 42.225
Or, if you depend on the NA:
within(dat, {
  g <- gsub("\\d", "", site)
  # Group size as an integer; ave(site, ...) would return character counts
  n <- ave(seq_along(site), g, FUN=length)
  mid.lat <- NA
  mid.long <- NA
  mid.lat[n > 1] <- ave(lat[n > 1], g[n > 1])
  mid.long[n > 1] <- ave(long[n > 1], g[n > 1])
  rm(g, n)
})
# site lat long mid.long mid.lat
# 1 bras2 41.21 -115.11 -115.150 41.160
# 2 tex4 45.30 -112.31 NA NA
# 3 bras2 41.15 -115.15 -115.150 41.160
# 4 bras2 41.12 -115.19 -115.150 41.160
# 5 foo1 42.10 -123.10 -123.225 42.225
# 6 foo2 42.20 -123.20 -123.225 42.225
# 7 foo11 42.30 -123.30 -123.225 42.225
# 8 foo12 42.30 -123.30 -123.225 42.225
Data:
dat <- structure(list(site = c("bras2", "tex4", "bras2", "bras2", "foo1",
"foo2", "foo11", "foo12"), lat = c(41.21, 45.3, 41.15, 41.12,
42.1, 42.2, 42.3, 42.3), long = c(-115.11, -112.31, -115.15,
-115.19, -123.1, -123.2, -123.3, -123.3)), class = "data.frame", row.names = c(NA,
-8L))
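Note that stripping the digits merges sites such as foo1 and foo2 into one group. If the grouping should instead follow the exact site name, as in the question's expected output, ave can be applied to site directly. A minimal sketch, assuming the four rows from the question and the column names centre_lat and centre_long that the question asks for:

```r
# The four rows from the question
dat <- data.frame(
  site = c("bras2", "tex4", "bras2", "bras2"),
  lat  = c(41.21, 45.3, 41.15, 41.12),
  long = c(-115.11, -112.31, -115.15, -115.19)
)

# ave() defaults to FUN = mean and returns one value per input row,
# so the result can be assigned straight back as a new column.
dat$centre_lat  <- ave(dat$lat,  dat$site)
dat$centre_long <- ave(dat$long, dat$site)

dat
```

A site with a single row, such as tex4, simply gets its own coordinates back as the centre.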
The geosphere package has a function, centroid, to solve problems such as this one.
As long as there is more than one point in a shape, it is straightforward. Most of the code below involves handling the single-point case in the example above.
df <- read.table(header=TRUE, text= "site lat long
bras2 41.21 -115.11
tex4 45.3 -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19")
library(dplyr)
library(geosphere)
# Split by site and compute each group's centroid separately
sites <- split(df, df$site)
results <- lapply(sites, function(x) {
  if (nrow(x) > 1) {
    # Columns reordered to (long, lat), the order geosphere expects
    value <- as.data.frame(centroid(x[, c(3, 2)]))
  } else {
    # Single point: the point itself is the centre
    value <- x[1, c(3, 2)]
    names(value) <- c("lon", "lat")
  }
  value$site <- x$site[1]
  value
})
answer <- bind_rows(results)
lon lat site
1 -115.15 41.16001 bras2
2 -112.31 45.30000 tex4
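The result above has one row per site, whereas the question asks for the centre to be added as columns on the original data. One way to do that is a lookup with match. This is a minimal sketch: the answer data frame is rebuilt here from the printed output above so that the example runs without geosphere, and the column names centre_lat and centre_long are taken from the question.

```r
df <- read.table(header = TRUE, text = "site lat long
bras2 41.21 -115.11
tex4 45.3 -112.31
bras2 41.15 -115.15
bras2 41.12 -115.19")

# Per-site centroids, copied from the printed output above
answer <- data.frame(
  lon  = c(-115.15, -112.31),
  lat  = c(41.16001, 45.3),
  site = c("bras2", "tex4")
)

# For every row of df, find the row of answer with the same site,
# then copy that site's centroid into the new columns.
idx <- match(df$site, answer$site)
df$centre_lat  <- answer$lat[idx]
df$centre_long <- answer$lon[idx]

df
```

Unlike merge, this keeps the rows of df in their original order.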