I have a .csv of species occurrences with individual lat-long points, and I am trying to aggregate them into a single 'average' coordinate point per species. From some digging I see there can be issues with a simple average (high altitude or low altitude outliers can cause issues). Does anyone have a suggestion on how to do this easily/quickly in R? Thanks
Data is set up like this, but I have 71,000+ occurrences in total.
species | longitude | latitude |
---|---|---|
Abies amabilis | -111.112964 | 41.199112 |
Abies arizonica | -110.8678 | 37.0349 |
Abies bifolia | -111.650833 | 41.82 |
Abies bifolia | -113.377722 | 41.950833 |
Using
your_data <- tibble::tribble(
  ~species,          ~longitude,  ~latitude,
  "Abies amabilis",  -111.112964, 41.199112,
  "Abies arizonica", -110.8678,   37.0349,
  "Abies bifolia",   -111.650833, 41.82,
  "Abies bifolia",   -113.377722, 41.950833
)
you could do
aggregate(. ~ species, your_data, mean)
to calculate averages. This returns:
species longitude latitude
1 Abies amabilis -111.1130 41.19911
2 Abies arizonica -110.8678 37.03490
3 Abies bifolia -112.5143 41.88542
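Since the question mentions outliers, one possible variation (a sketch, not something the answer above proposes) is to swap `mean` for `median`, which is less sensitive to a few extreme points. The name `species_median` is just illustrative, and the toy data is rebuilt as a plain data.frame so the snippet runs on its own.

```r
# Same toy data as above, rebuilt so this block is self-contained
your_data <- data.frame(
  species   = c("Abies amabilis", "Abies arizonica", "Abies bifolia", "Abies bifolia"),
  longitude = c(-111.112964, -110.8678, -111.650833, -113.377722),
  latitude  = c(41.199112, 37.0349, 41.82, 41.950833)
)

# Coordinate-wise median per species instead of the mean
species_median <- aggregate(. ~ species, your_data, median)
species_median
```

With only one or two points per species the median equals the mean, so the difference only shows up on the full data. Note that a coordinate-wise median takes longitude and latitude separately, so the result need not coincide with any observed occurrence.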
Alternatively, dplyr could be used to do
library(dplyr)
your_data %>%
group_by(species) %>%
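summarize(across(c(longitude, latitude), mean))  # name the columns explicitly; newer dplyr versions deprecate across() without .cols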
which similarly returns
# A tibble: 3 x 3
species longitude latitude
<chr> <dbl> <dbl>
1 Abies amabilis -111. 41.2
2 Abies arizonica -111. 37.0
3 Abies bifolia -113. 41.9
Using data.table
library(data.table)
setDT(your_data)[, lapply(.SD, mean), by = species]  # your_data is the table defined above; setDT converts it by reference
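All of the above average longitude and latitude as plain numbers. If a species' points are spread widely or straddle the 180° meridian, that can give a misleading centre. A minimal base R sketch of one alternative is to average the points as unit vectors on a sphere and convert back. `spherical_centroid` is a hypothetical helper written for this example, not a function from any package, and it assumes a spherical Earth and coordinates in decimal degrees.

```r
# Hypothetical helper: mean position on a unit sphere (degrees in, degrees out)
spherical_centroid <- function(longitude, latitude) {
  lon <- longitude * pi / 180
  lat <- latitude * pi / 180
  # Average the 3D unit vectors of all points
  x <- mean(cos(lat) * cos(lon))
  y <- mean(cos(lat) * sin(lon))
  z <- mean(sin(lat))
  # Convert the mean vector back to longitude/latitude
  c(longitude = atan2(y, x) * 180 / pi,
    latitude  = atan2(z, sqrt(x^2 + y^2)) * 180 / pi)
}

# Same toy data as in the answers above
your_data <- data.frame(
  species   = c("Abies amabilis", "Abies arizonica", "Abies bifolia", "Abies bifolia"),
  longitude = c(-111.112964, -110.8678, -111.650833, -113.377722),
  latitude  = c(41.199112, 37.0349, 41.82, 41.950833)
)

# Apply the helper to each species and bind the results into one data.frame
centroids <- do.call(rbind, lapply(split(your_data, your_data$species), function(d) {
  data.frame(species = d$species[1], t(spherical_centroid(d$longitude, d$latitude)))
}))
rownames(centroids) <- NULL
centroids
```

For points that sit close together, as in this sample, the result should be nearly identical to the plain mean, so this is mainly worth the extra step for widely distributed species.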