
Aggregate over combinations of columns with data.table

Suppose I have this data

> dput(data)
structure(list(Country = c("USA", "USA", "USA", "USA", "USA", 
"USA", "USA", "USA", "USA"), Location = c("West", "East", "East", 
"North", "North", "East", "West", "North", "East"), Gender = c("M", 
"M", "F", "F", "F", "F", "F", "F", "M"), Age = c("20 - 30", "30 - 40", 
"20 - 30", "30 - 40", "20 - 30", "20 - 30", "30 - 40", "20 - 30", 
"30 - 40"), Civil_Status = c("Single", "Single", "Married", "Married", 
"Married", "Single", "Single", "Married", "Married"), Expenditure = c(320, 
400, 800, 900, 750, 350, 620, 1200, 800)), row.names = c(NA, 
-9L), class = c("tbl_df", "tbl", "data.frame"))


  Country Location Gender Age     Civil_Status Expenditure
  <chr>   <chr>    <chr>  <chr>   <chr>              <dbl>
1 USA     West     M      20 - 30 Single               320
2 USA     East     M      30 - 40 Single               400
3 USA     East     F      20 - 30 Married              800
4 USA     North    F      30 - 40 Married              900
5 USA     North    F      20 - 30 Married              750
6 USA     East     F      20 - 30 Single               350
7 USA     West     F      30 - 40 Single               620
8 USA     North    F      20 - 30 Married             1200
9 USA     East     M      30 - 40 Married              800

I want to sum Expenditure over every combination of the variables Gender, Age, and Civil_Status, first within Country and then within each Location, and finally combine all of these results into a single dataset.

Here is an example of the groupings I need:

USA
USA, Gender
USA, Age
USA, Civil_Status
USA, Gender, Age
USA, Gender, Civil_Status
.....................
West, Gender
West, Age
.....................

In this case there are 2^3 = 8 combinations for Country and 8 combinations for each of the locations.
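The count of 8 can be checked by enumerating every subset of the three grouping variables, including the empty subset (which corresponds to the plain "USA" or "West" total). This is a minimal base R sketch; the names `vars` and `subsets` are illustrative and do not come from the original post.

```r
# The three variables whose combinations are needed
vars <- c("Gender", "Age", "Civil_Status")

# All subsets of vars: the empty set plus every combination of size 1, 2 and 3
subsets <- c(
  list(character(0)),
  unlist(
    lapply(seq_along(vars), function(k) combn(vars, k, simplify = FALSE)),
    recursive = FALSE
  )
)

length(subsets)  # 8, i.e. 2^3
```

Each element of `subsets` is a character vector that could be prefixed with "Country" or "Location" to form one grouping.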

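One option is rollup from data.table. Note that rollup only returns the hierarchical subtotals (Country; Country and Gender; Country, Gender and Age; and so on), so it does not cover every combination. cube and groupingsets from the same package handle arbitrary combinations.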

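library(data.table)
# Convert the tibble to a data.table by reference
setDT(data)
# Hierarchical subtotals of Expenditure; NA in a grouping column means "aggregated over"
rollup(data, j = sum(Expenditure), by = c("Country","Gender","Age", "Civil_Status"))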
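To get all 2^3 combinations for the country and for each location, one possible approach is `cube`, which aggregates over every subset of the `by` columns. The sketch below is not from the original answer. It assumes the grouping columns contain no genuine `NA` values, because `NA` in the output marks a column that was aggregated over. The names `vars`, `Total`, `Region`, `by_country`, `by_location` and `result` are illustrative.

```r
library(data.table)

# Sample data from the question, rebuilt as a data.table
data <- data.table(
  Country = rep("USA", 9),
  Location = c("West", "East", "East", "North", "North", "East", "West", "North", "East"),
  Gender = c("M", "M", "F", "F", "F", "F", "F", "F", "M"),
  Age = c("20 - 30", "30 - 40", "20 - 30", "30 - 40", "20 - 30",
          "20 - 30", "30 - 40", "20 - 30", "30 - 40"),
  Civil_Status = c("Single", "Single", "Married", "Married", "Married",
                   "Single", "Single", "Married", "Married"),
  Expenditure = c(320, 400, 800, 900, 750, 350, 620, 1200, 800)
)

vars <- c("Gender", "Age", "Civil_Status")

# cube() aggregates over every subset of `by`; dropping rows where Country is NA
# keeps only the 8 groupings that still include Country
by_country <- cube(data, j = list(Total = sum(Expenditure)),
                   by = c("Country", vars))[!is.na(Country)]

# Same idea with Location as the leading column
by_location <- cube(data, j = list(Total = sum(Expenditure)),
                    by = c("Location", vars))[!is.na(Location)]

# Give both tables a common first column so they can be stacked
setnames(by_country, "Country", "Region")
setnames(by_location, "Location", "Region")
result <- rbindlist(list(by_country, by_location), use.names = TRUE)

print(result)
```

In `result`, a row such as Region = "West" with `NA` in Gender, Age and Civil_Status holds the overall total for that location. If exact control over which groupings appear is needed, `groupingsets` with an explicit `sets` list is an alternative.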
