
R: Multiple levels of grouping in dplyr::summarise using lapply

I'm relatively new to R and am having trouble applying a list of criteria to a data.frame I'm trying to summarize. The posts I've read only deal with one level of grouping, not a second one.

Assume my df looks like this (my actual data frame is much larger: there are 35 different "Codes" and about 20 different "Colors"):

    Code   Color   Value
[1] A      Red     10
[2] A      Blue    15
[3] A      Red     5
[4] B      Green   20
[5] B      Red     15 
[6] C      Green   10

Ideally, I'd like to create a summary table that groups the data by Code (I've managed this with group_by and split), but I'd also like to sum the values for each "Color". Currently, I've only been able to do this by running the criteria one at a time.

So far I've been able to do this:

#this gives me the total value by each code, like a pivot or a sumif
dfsummary <- df %>% group_by(Code) %>% summarise(total = sum(Value))

#then I was able to come up with this to give me, by Code, value by Color.
dfsummary2 <- df %>% filter(Color == "Red") %>% group_by(Code) %>%
  summarise(sumRed = sum(Value))

The results in dfsummary2 are:

   Code   sumRed   
[1] A      15     
[2] B      15    
[3] C      0

What I'd like to accomplish is creating a data frame that covers every "Color" without having to specify each one individually.

My desired output, let's call it dfsummaryall, looks like:

    Code   sumRed   sumBlue  sumGreen
[1] A      15       15       0
[2] B      15       0        20
[3] C      0        0        10

This is where I get stumped. I can run each one individually and then merge them into one table, but I'd like to find a way to work in an apply function (lapply, I would think). This is where I'm definitely a novice.

My attempt so far, and this is where I'm sure I'm egregiously wrong, goes like this:

colors <- c("Red","Blue","Green")

dfsummaryall <- lapply(colors, function(x) {
  dftmp <- df %>%
    dplyr::filter(Color == x) %>%
    group_by(Code) %>%
    summarise(x == sum(Value))
})

I know there's definitely a problem in the "summarise(x == sum(Value))" part, but I'm really stumped as to how to pull this off.

Any help would be truly appreciated!
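
For reference, the lapply idea can be made to work. The following is only a sketch, assuming the column names from the sample data (Code, Color, Value); `pieces` and `out` are names introduced here for illustration. Rather than filtering first, it sums Value only where Color matches inside summarise, so a Code without that color gets 0 instead of dropping out, and it then joins the per-color tables on Code.

```r
library(dplyr)

# Sample data from the question.
df <- data.frame(
  Code  = c("A", "A", "A", "B", "B", "C"),
  Color = c("Red", "Blue", "Red", "Green", "Red", "Green"),
  Value = c(10, 15, 5, 20, 15, 10)
)

colors <- c("Red", "Blue", "Green")

# One summary table per color. Subsetting Value inside sum() keeps every
# Code in the result; sum() of an empty vector is 0.
pieces <- lapply(colors, function(x) {
  out <- df %>%
    group_by(Code) %>%
    summarise(total = sum(Value[Color == x]))
  # Rename the summary column to sumRed, sumBlue, ...
  names(out)[names(out) == "total"] <- paste0("sum", x)
  out
})

# Join the per-color tables into one wide table keyed by Code.
dfsummaryall <- Reduce(function(a, b) left_join(a, b, by = "Code"), pieces)
print(dfsummaryall)
```

With around 20 colors, `colors <- unique(df$Color)` would avoid typing them out by hand.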

From user duckmayr in the comments:

df %>% group_by(Code, Color) %>% summarise(Sum = sum(Value)) %>% tidyr::spread(Color, Sum, fill = 0)

This worked perfectly for my purposes.
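
A note for newer package versions: tidyr::spread() has been superseded by tidyr::pivot_wider() since tidyr 1.0.0. Below is a sketch of what should be an equivalent call, assuming the sample data from the question and dplyr 1.0.0 or later (for the `.groups` argument). The `names_prefix = "sum"` argument is an addition here to get sumRed-style column names; it is not part of the original comment.

```r
library(dplyr)
library(tidyr)

# Sample data from the question.
df <- data.frame(
  Code  = c("A", "A", "A", "B", "B", "C"),
  Color = c("Red", "Blue", "Red", "Green", "Red", "Green"),
  Value = c(10, 15, 5, 20, 15, 10)
)

dfsummaryall <- df %>%
  group_by(Code, Color) %>%
  summarise(Sum = sum(Value), .groups = "drop") %>%
  # One column per Color; Code/Color pairs with no rows are filled with 0.
  pivot_wider(
    names_from   = Color,
    values_from  = Sum,
    values_fill  = list(Sum = 0),
    names_prefix = "sum"
  )
print(dfsummaryall)
```

The new columns appear in the order the colors are first encountered, so they may not match the order in the desired output; dplyr::select() can reorder them if that matters.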


 