group_by and summarize() multiple things in R using dplyr/tidyverse

Question

I am trying to find the country with the highest average age, but I also need to filter out countries with fewer than 5 entries in the data frame. I tried the following, but it does not work:

bil %>% 
  group_by(citizenship,age) %>% 
  mutate(n=count(citizenship), theMean=mean(age,na.rm=T)) %>% 
  filter(n>=5) %>% 
  arrange(desc(theMean))

bil is the dataset. I am trying to count how many entries I have for each country, filter out countries with fewer than 5 entries, find the average age for each country, and then find the country with the highest average. I am confused about how to do both things at the same time: if I do one summarize at a time, I lose the rest of my data.

Answer 1

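Perhaps this could help. Note that the parameter 'x' in count() is a tbl/data.frame, so count() cannot be applied to a single column inside mutate(). Instead, group by 'citizenship' only (with 'age' as an extra grouping variable, each citizenship/age combination becomes its own group, so the group mean is just that age), get the number of rows per group with n(), get the mean of 'age', and then do the filter: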

bil %>%
  group_by(citizenship) %>%
  mutate(n = n()) %>%
  mutate(theMean = mean(age, na.rm = TRUE)) %>%
  filter(n >= 5) %>%
  arrange(desc(theMean))
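
If only one row per country is needed (rather than every original row carrying n and theMean), both statistics can be computed in a single summarise() call. The following is a minimal sketch, not the asker's actual data: the small bil tibble is a hypothetical stand-in, and the names country_summary and top_country are introduced only for illustration. Note that n() counts all rows in a group, including rows where age is NA.

```r
library(dplyr)

# Hypothetical stand-in for `bil`; the real data set is not shown in the question.
bil <- tibble(
  citizenship = c(rep("A", 5), rep("B", 6), rep("C", 2)),
  age = c(50, 60, 70, 80, NA, 40, 45, 50, 55, 60, 65, 90, 95)
)

# One row per country: row count and mean age computed together.
country_summary <- bil %>%
  group_by(citizenship) %>%
  summarise(n = n(), theMean = mean(age, na.rm = TRUE)) %>%
  filter(n >= 5) %>%          # drop countries with fewer than 5 entries
  arrange(desc(theMean))      # highest average age first

# The first row is the country with the highest average age.
top_country <- country_summary %>% slice(1)
```

With this toy data, country C is dropped because it has only 2 entries, even though its mean age is the highest. The mutate() version above keeps every original row, which is preferable when the other columns are still needed.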
