I have a data frame named Cust_Amount, which is as follows:
Age Amount_Spent
25 20
43 15
32 27
37 10
45 17
29 10
I want to break it down into equal-sized age groups and sum the amount spent for each age group, as shown below:
Age_Group Total_Amount
20-30 30
30-40 37
40-50 32
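For anyone who wants to run the answers, the sample data can be rebuilt roughly like this. This is only a sketch: the integer column types are an assumption (the question shows values, not types), and the df1 and mydf copies are there only because the answers refer to the data under those names.

```r
# Sample data copied from the table in the question.
# Integer types are an assumption; the question does not state them.
Cust_Amount <- data.frame(
  Age          = c(25L, 43L, 32L, 37L, 45L, 29L),
  Amount_Spent = c(20L, 15L, 27L, 10L, 17L, 10L)
)

# The answers call the same data df1 and mydf, so keep copies under those names.
df1  <- Cust_Amount
mydf <- Cust_Amount

str(Cust_Amount)
```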
We can use cut to bin 'Age' into groups and then sum 'Amount_Spent' within each group (df1 here is the data frame from the question):
    library(data.table)
    # Convert df1 to a data.table by reference, then sum Amount_Spent per age bin
    setDT(df1)[, .(Total_Amount = sum(Amount_Spent)),
               by = .(Age_Group = cut(Age, breaks = c(20, 30, 40, 50)))]
Or with dplyr:
    library(dplyr)
    df1 %>%
      group_by(Age_Group = cut(Age, breaks = c(20, 30, 40, 50))) %>%
      summarise(Total_Amount = sum(Amount_Spent))
    #   Age_Group Total_Amount
    #      <fctr>        <int>
    # 1   (20,30]           30
    # 2   (30,40]           37
    # 3   (40,50]           32
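The (20,30] style labels reflect cut's default of right = TRUE: each bin excludes its lower bound and includes its upper bound. None of the ages in the question sit exactly on a break, so the choice does not affect the result here, but it matters for boundary values. A small sketch, using made-up boundary ages that are not part of the question's data:

```r
ages <- c(20, 30, 40)  # hypothetical boundary values, for illustration only

# Default (right = TRUE): bins are open on the left, closed on the right.
right_closed <- cut(ages, breaks = c(20, 30, 40, 50))
as.character(right_closed)
# 20 falls outside every bin and becomes NA; 30 lands in "(20,30]"

# right = FALSE flips this: bins are closed on the left, open on the right.
left_closed <- cut(ages, breaks = c(20, 30, 40, 50), right = FALSE)
as.character(left_closed)
# 20 lands in "[20,30)"; 30 lands in "[30,40)"
```

If an age of exactly 20 should count toward the first group under the default, cut also accepts include.lowest = TRUE.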
Here's a base R solution using cut and aggregate, with setNames to name the resulting columns (mydf is the data frame from the question):
    # Bin Age into (20,30], (30,40], (40,50]
    mydf$Age_Group <- cut(mydf$Age, breaks = seq(20, 50, by = 10))
    # Sum Amount_Spent within each bin and rename the columns
    with(mydf, setNames(aggregate(Amount_Spent ~ Age_Group, FUN = sum),
                        c('Age_Group', 'Total_Spent')))
      Age_Group Total_Spent
    1   (20,30]          30
    2   (30,40]          37
    3   (40,50]          32
We can take it a step further using gsub to match your desired output (note that I'm no regular expression expert):
    # Strip "(" and "]" from the labels, then turn "," into " - ".
    # fixed = TRUE matches the characters literally, so no regex is involved.
    mydf$Age_Group <-
      gsub(pattern = ',',
           x = gsub(pattern = ']',
                    x = gsub(pattern = '(', x = mydf$Age_Group,
                             replacement = '', fixed = TRUE),
                    replacement = '', fixed = TRUE),
           replacement = ' - ', fixed = TRUE)
    with(mydf, setNames(aggregate(Amount_Spent ~ Age_Group, FUN = sum),
                        c('Age_Group', 'Total_Spent')))
      Age_Group Total_Spent
    1   20 - 30          30
    2   30 - 40          37
    3   40 - 50          32
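As an alternative to cleaning the labels afterwards, cut takes a labels argument, so the "20-30" style names can be supplied up front. The sketch below is not part of the answers above: it rebuilds the sample data itself (integer types assumed), and the breaks, labels, and result variables are names chosen for illustration.

```r
# Sample data from the question, rebuilt so this snippet runs on its own.
mydf <- data.frame(
  Age          = c(25L, 43L, 32L, 37L, 45L, 29L),
  Amount_Spent = c(20L, 15L, 27L, 10L, 17L, 10L)
)

breaks <- seq(20, 50, by = 10)
# Pair each break with the next one: "20-30", "30-40", "40-50"
labels <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# Pass the labels to cut directly instead of post-processing with gsub
mydf$Age_Group <- cut(mydf$Age, breaks = breaks, labels = labels)

result <- setNames(aggregate(Amount_Spent ~ Age_Group, data = mydf, FUN = sum),
                   c("Age_Group", "Total_Amount"))
result
```

This keeps Age_Group as a factor with its levels in break order, whereas the gsub route turns it into a character vector that sorts alphabetically.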