functions on groups within groups R

Question

Let's say I have a data frame df with three columns: revenue (int), quarter (factor with 4 levels), and product (factor with 3 levels).

df <- data.frame(
     revenue = sample(500:5000, 50, replace = TRUE),
     quarter = sample(c("q1", "q2", "q3", "q4"), 50, replace = TRUE),
     product = sample(c("book", "movie", "tv"), 50, replace = TRUE),
     stringsAsFactors = TRUE)  # needed in R >= 4.0 to get factor columns

It would be very easy to use tapply to group by either quarter or product and perform a variety of functions on revenue, like this:

quarterly_revenue <- tapply(df$revenue, df$quarter, sum)

which gives me the sum of revenue per quarter.

However, this is my question: what if I want it more granular, i.e., the sum of each product's revenue per quarter? I've tried using the split function to create a list of data frames, and I've tried various plyr solutions, but none give me the output I'm looking for. I know I could subset on each factor, but that seems inefficient, particularly since the actual data set I'm working with has many more factor levels.

Any ideas? Thanks for the help!

Answer 1

Pass the grouping columns to tapply as a list to get the sum for each quarter/product combination:

tapply(df$revenue, list(df$quarter, df$product), sum)
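With two grouping factors, tapply returns a matrix with one row per quarter and one column per product. Here is a minimal sketch of how to read a cell from that matrix and reshape it to long format, assuming a small hand-made data frame (the values are illustrative, not taken from the question):

```r
# Illustrative data; the values are made up for this sketch
df <- data.frame(
  revenue = c(100, 200, 300, 400, 500, 600),
  quarter = factor(c("q1", "q1", "q2", "q2", "q1", "q2")),
  product = factor(c("book", "movie", "book", "movie", "book", "book"))
)

# Rows are quarters, columns are products
rev_matrix <- tapply(df$revenue, list(df$quarter, df$product), sum)

# Look up a single quarter/product cell by name
q1_book <- rev_matrix["q1", "book"]

# Convert to long format: one row per quarter/product pair
rev_long <- as.data.frame(as.table(rev_matrix))
names(rev_long) <- c("quarter", "product", "revenue")
```

Combinations that never occur in the data show up as NA in the matrix, which is worth keeping in mind before summing across rows or columns.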

This is simpler with aggregate, which returns a data frame with one row per combination:

aggregate(revenue~., df, sum)
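The dot in revenue ~ . stands for every other column in the data frame, so it only groups by quarter and product because those are the only other columns. A short sketch of the difference, assuming a hypothetical extra column named region and made-up values:

```r
# Illustrative data; region is a hypothetical extra column
df <- data.frame(
  revenue = c(100, 200, 300, 400, 500, 600),
  quarter = c("q1", "q1", "q2", "q2", "q1", "q2"),
  product = c("book", "movie", "book", "movie", "book", "book"),
  region  = c("east", "west", "east", "west", "west", "east")
)

# Name the grouping columns explicitly so region is ignored
by_qp <- aggregate(revenue ~ quarter + product, data = df, FUN = sum)

# With the dot, region becomes a third grouping column
by_all <- aggregate(revenue ~ ., data = df, FUN = sum)
```

Spelling out the grouping columns keeps the result stable if more columns are added to the data later.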

Or with dplyr (data.table is another option):

library(dplyr)
df %>% 
    group_by(quarter, product) %>%
    summarise(Sum = sum(revenue))

Answer 2

You can use data.table with the by argument. This adds a quarterly_revenue column to df by reference:

library( data.table )
setDT( df )[ , quarterly_revenue := sum( revenue ), 
               by = .( quarter, product ) ]

Or, to summarise (instead of just adding a column):

library( data.table )
library( magrittr )

setDT( df )[ , sum( revenue ), 
               by = .( quarter, product ) ] %>%
    setnames( c( "quarter", "product", "quarterly_revenue" ) )
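As an alternative sketch, the summary column can be named directly inside j, which avoids the setnames step and the magrittr dependency. This assumes data.table is installed and uses a separate made-up table dt rather than the question's df:

```r
library(data.table)

# Illustrative data; the values are made up for this sketch
dt <- data.table(
  revenue = c(100, 200, 300, 400, 500, 600),
  quarter = c("q1", "q1", "q2", "q2", "q1", "q2"),
  product = c("book", "movie", "book", "movie", "book", "book")
)

# .() in j builds the output columns, so the name is set in place
summary_dt <- dt[, .(quarterly_revenue = sum(revenue)),
                 by = .(quarter, product)]
```

The result has one row per quarter/product pair and leaves dt itself unchanged.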
