
Summarize occurrences by area and then by custom groups

I have the code below, which takes a two-column dataset and creates age group categories based on the stated CustomerAge.

    library(tidyverse)

    df <- read.table(textConnection("Area   CustomerAge
    A 28
    A 40
    A 70
    A 19
    B 13
    B 12
    B 72
    B 90"), header = TRUE)

    df2 <- df %>%
      mutate(
        # Create categories
        Customer_Age_Group = dplyr::case_when(
          CustomerAge <= 18                    ~ "0-18",
          CustomerAge > 18 & CustomerAge <= 60 ~ "19-60",
          CustomerAge > 60                     ~ ">60"
        )
      )

What I am looking for is a summary that looks like this:

    Area  Customer_Age_Group  Occurrences
    A     0-18                0
    A     19-60               3
    A     >60                 1
    B     0-18                2
    B     19-60               0
    B     >60                 2

To also include groups with 0 occurrences, you need count(), ungroup() and complete():

    df2 %>%
      group_by(Area, Customer_Age_Group, .drop = FALSE) %>%
      count() %>%
      ungroup() %>%
      complete(Area, Customer_Age_Group, fill = list(n = 0))

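This also shows the groups with 0 occurrences. Note that .drop = FALSE only affects factor columns; since both grouping columns here are character, it is complete() that adds the zero rows.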
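The pipeline above leaves the counts in a column named n, whereas the question asks for a column named Occurrences. Here is a minimal sketch of one way to get that name directly, assuming dplyr 0.8 or later (for the name argument of count()) and tidyr are installed. The data frame is rebuilt inline so the snippet runs on its own, and `result` is only an illustrative name.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  Area = c("A", "A", "A", "A", "B", "B", "B", "B"),
  CustomerAge = c(28, 40, 70, 19, 13, 12, 72, 90)
)

result <- df %>%
  mutate(Customer_Age_Group = case_when(
    CustomerAge <= 18 ~ "0-18",
    CustomerAge <= 60 ~ "19-60",
    TRUE              ~ ">60"
  )) %>%
  # count() groups, tallies and ungroups in one step; name sets the count column
  count(Area, Customer_Age_Group, name = "Occurrences") %>%
  # add the Area / age group combinations that never occur, with a count of 0
  complete(Area, Customer_Age_Group, fill = list(Occurrences = 0L))

print(result)
```

Because count() returns ungrouped data, the separate group_by() and ungroup() calls are not needed in this variant.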

To sort by Area and then by age group (parse_number() extracts the first number from each label, so ">60" sorts last):

    df2 %>%
      group_by(Area, Customer_Age_Group, .drop = FALSE) %>%
      count() %>%
      ungroup() %>%
      complete(Area, Customer_Age_Group, fill = list(n = 0)) %>%
      arrange(Area, parse_number(Customer_Age_Group))
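As an alternative to parsing numbers out of the labels, the age group can be stored as a factor whose levels carry the desired order. The sketch below assumes that approach; it is not part of the answer above. Base R's cut() builds the factor, and because the column is then a factor, .drop = FALSE in count() keeps the empty levels, so neither complete() nor parse_number() should be needed. The names `df` and `result` are illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  Area = c("A", "A", "A", "A", "B", "B", "B", "B"),
  CustomerAge = c(28, 40, 70, 19, 13, 12, 72, 90)
)

result <- df %>%
  mutate(
    # Intervals are right-closed by default: (-Inf, 18], (18, 60], (60, Inf)
    Customer_Age_Group = cut(
      CustomerAge,
      breaks = c(-Inf, 18, 60, Inf),
      labels = c("0-18", "19-60", ">60")
    )
  ) %>%
  # .drop = FALSE keeps factor levels that have no rows within an Area
  count(Area, Customer_Age_Group, .drop = FALSE, name = "Occurrences")

print(result)
```

The rows come back ordered by Area and then by factor level, so the groups appear as 0-18, 19-60, >60 within each area. Note that Area must come before the factor column for every area to receive all three levels.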

group_by() and summarise() are what you're looking for.

    df2 %>%
      group_by(Area, Customer_Age_Group) %>%
      summarise(Occurrences = n())

However, note that this won't show categories with zero occurrences in your data set.
