I have a dataset with district name, household latitude, and household longitude for 2000 household locations. I want to calculate the mean latitude and longitude for each district, and then add two new columns (i.e. Lat_mean and Long_mean) that store the district mean for each household.
So far I have only been able to aggregate the mean latitude and longitude per district. I don't know how to attach the summarized values as new columns for each ID (see the code below).
library(dplyr)

id <- c(1, 2, 3, 4, 5, 6)
district <- c("A", "B", "C", "A", "A", "B")
lat <- c(28.6, 30.2, 35.9, 27.5, 27.9, 31.5)
long <- c(77.5, 85.2, 66.5, 75.0, 79.2, 88.8)
df <- data.frame(id, district, lat, long)

# Per-district means: one row per district, not one row per household
df_group <- df %>% group_by(district) %>% summarise_at(vars(lat:long), mean)
I expect the following: Lat_mean and Long_mean columns are added to 'df', and each ID gets the mean values for its district.
We can use mutate_at instead of summarise_at. Within the list, specify the name, so that it will create new columns with that name as a suffix.
library(dplyr)
df %>%
  group_by(district) %>%
  mutate_at(vars(lat, long), list(mean = mean))
# A tibble: 6 x 6
# Groups:   district [3]
#     id district   lat  long lat_mean long_mean
#  <dbl> <fct>    <dbl> <dbl>    <dbl>     <dbl>
#1     1 A         28.6  77.5     28        77.2
#2     2 B         30.2  85.2     30.8      87
#3     3 C         35.9  66.5     35.9      66.5
#4     4 A         27.5  75       28        77.2
#5     5 A         27.9  79.2     28        77.2
#6     6 B         31.5  88.8     30.8      87
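In dplyr 1.0.0 and later, the `_at` variants are superseded by across(), so the same idea could also be written as below. This is only a sketch under that version assumption; the name df_means is made up for illustration, and the ungroup() at the end is optional.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  district = c("A", "B", "C", "A", "A", "B"),
  lat = c(28.6, 30.2, 35.9, 27.5, 27.9, 31.5),
  long = c(77.5, 85.2, 66.5, 75.0, 79.2, 88.8)
)

# df_means is a hypothetical name, not from the original post
df_means <- df %>%
  group_by(district) %>%
  # apply mean() to lat and long within each district;
  # .names controls the new column names (lat_mean, long_mean)
  mutate(across(c(lat, long), mean, .names = "{.col}_mean")) %>%
  ungroup()
```

Because mutate() keeps every row, each household retains its own lat and long alongside the mean for its district.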
Alternatively, base R's ave() can be used inside mutate; it returns the group mean repeated for every row of the group:
df %>%
  mutate(lat_mean = ave(lat, district, FUN = mean),
         lon_mean = ave(long, district, FUN = mean))
  id district  lat long lat_mean lon_mean
1  1        A 28.6 77.5    28.00 77.23333
2  2        B 30.2 85.2    30.85 87.00000
3  3        C 35.9 66.5    35.90 66.50000
4  4        A 27.5 75.0    28.00 77.23333
5  5        A 27.9 79.2    28.00 77.23333
6  6        B 31.5 88.8    30.85 87.00000
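If the per-district summary table is wanted anyway, as in the question, another option is to join it back onto the original data by district. The following is a rough sketch of that approach and not something either answer proposed; the names district_means and df_joined are made up for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  district = c("A", "B", "C", "A", "A", "B"),
  lat = c(28.6, 30.2, 35.9, 27.5, 27.9, 31.5),
  long = c(77.5, 85.2, 66.5, 75.0, 79.2, 88.8)
)

# One row per district with the mean coordinates
# (district_means is a hypothetical name)
district_means <- df %>%
  group_by(district) %>%
  summarise(lat_mean = mean(lat), long_mean = mean(long))

# Attach the district means to every household row, matching on district
df_joined <- left_join(df, district_means, by = "district")
```

This keeps the summary table available for other uses, at the cost of an extra step compared with computing the means directly inside mutate().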