I'm relatively new to R and I'm having trouble applying a list of criteria to a data.frame I'm trying to summarize. I've read a number of posts, but they only seem to cover one level of grouping, not a second one.
Assume my df looks like this (my actual data frame is much larger: there are 35 different "Codes" and about 20 different "Colors"):
Code Color Value
[1] A Red 10
[2] A Blue 15
[3] A Red 5
[4] B Green 20
[5] B Red 15
[6] C Green 10
Ideally, I'd like to create a summary table that groups the data by Code (I've managed this with group_by and split), but then I'd also like to sum the values by the criterion "Color". So far, I've only been able to do this by running the criteria one at a time.
So far I've been able to do this:
#this gives me the total value by each code, like a pivot or a sumif
dfsummary <- df %>% group_by(Code) %>% summarise(total = sum(Value))
#then I was able to come up with this to give me, by Code, value by Color.
dfsummary2 <- df %>% filter(Color == "Red") %>% group_by(Code) %>%
  summarise(sumRed = sum(Value))
The results in dfsummary2 are:
Code sumRed
[1] A 15
[2] B 15
[3] C 0
What I'd like to accomplish is creating a data frame for all "Color" without having to specify each one individually.
My desired output, let's call it dfsummaryall, looks like:
Code sumRed sumBlue sumGreen
[1] A 15 15 0
[2] B 15 0 20
[3] C 0 0 10
This is where I get stumped. I can run each one individually and then merge them into one table, but I'd like to find a way to work in an apply function (lapply, I would think). This is where I'm definitely a novice.
My attempt so far, and this is where I'm sure I'm egregiously wrong, goes like this:
colors <- c("Red","Blue","Green")
dfsummaryall <- lapply(colors, function(x){dftmp <- df %>%
dplyr::filter(Color == x) %>% group_by(Code) %>%
summarise(x == sum(MktValue)
I know there's definitely a problem here in the "summarise(x == sum(MktValue)" part, but I'm really stumped as to how to pull this off.
Any help would be truly appreciated!
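For reference, here is one way the lapply idea above could be made to work. This is only a sketch under assumptions: the sample df is rebuilt from the table in the question, the column is taken to be Value (not MktValue), and the helper names per_color and out are made up for illustration. Instead of filtering first, it sums Value[Color == x] inside each Code group, so a Code with no rows of that color gets 0 rather than being dropped.

```r
library(dplyr)

# Sample data rebuilt from the table in the question
df <- data.frame(
  Code  = c("A", "A", "A", "B", "B", "C"),
  Color = c("Red", "Blue", "Red", "Green", "Red", "Green"),
  Value = c(10, 15, 5, 20, 15, 10)
)

colors <- c("Red", "Blue", "Green")

# One two-column summary (Code, sum<Color>) per color.
# per_color and out are hypothetical names used only in this sketch.
per_color <- lapply(colors, function(x) {
  out <- df %>%
    group_by(Code) %>%
    summarise(s = sum(Value[Color == x]))
  names(out)[2] <- paste0("sum", x)  # rename "s" to sumRed, sumBlue, ...
  out
})

# Merge the per-color summaries into one table keyed on Code
dfsummaryall <- Reduce(function(a, b) merge(a, b, by = "Code"), per_color)
print(dfsummaryall)
```

To cover every color without listing them by hand, colors could be replaced with unique(df$Color).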
From user duckmayr in the comments:
df %>% group_by(Code, Color) %>% summarise(Sum = sum(Value)) %>% tidyr::spread(Color, Sum, fill = 0)
This worked perfectly for my purposes.
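For anyone who wants to run this end to end, below is a self-contained sketch of the same approach. The sample df is rebuilt from the table in the question. It swaps in tidyr::pivot_wider in place of spread, since spread is marked as superseded in current tidyr; that swap, the "sum" column prefix, and the .groups argument are my additions rather than part of duckmayr's comment, and they assume reasonably recent versions of dplyr and tidyr.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the table in the question
df <- data.frame(
  Code  = c("A", "A", "A", "B", "B", "C"),
  Color = c("Red", "Blue", "Red", "Green", "Red", "Green"),
  Value = c(10, 15, 5, 20, 15, 10)
)

dfsummaryall <- df %>%
  group_by(Code, Color) %>%                        # both grouping levels at once
  summarise(Sum = sum(Value), .groups = "drop") %>%
  pivot_wider(
    names_from   = Color,   # one column per color
    values_from  = Sum,
    values_fill  = 0,       # Code/Color pairs with no rows become 0
    names_prefix = "sum"    # sumRed, sumBlue, sumGreen
  )

print(dfsummaryall)
```

Because the colors come from the data itself, this scales to the full set of codes and colors without naming any of them.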