I have a dataframe:
I want to group by "ID" and "direction", then compute summary statistics for "value". The hard part for me is the "category" column: I always need to output the last "category" within each "ID" group, as highlighted in the picture.
I have some code, but it does not give the result I want. Can anyone please help me modify it? Thank you for your time!
library(dplyr)

ID <- c(1, 1, 1, 2, 2, 2, 3, 3)
category <- c("green", "green", "red", "red", "green", "green", "yellow", "yellow")
direction <- c("in", "out", "in", "out", "in", "out", "in", "out")
value <- c(4, 5, 6, 7, 8, 9, 10, 11)
df <- data.frame(ID, category, direction, value)

res <- df %>%
  group_by(ID, direction) %>%
  arrange(ID, direction) %>%
  summarize(
    category = last(category),
    sum_value = sum(value),
    count_value = length(value)
  )
You're almost there. The problem is that last(category) is evaluated inside each ID/direction group, whereas you want the last category within each ID. Set the category per ID first, then group by ID and direction:
res <- df %>%
  group_by(ID) %>%
  mutate(category = last(category)) %>%
  ungroup() %>%
  group_by(ID, direction, category) %>%
  summarise(
    sum_value = sum(value),
    count_value = length(value)
  ) %>%
  ungroup()
It should do the trick.
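As a quick sanity check, here is a self-contained sketch that rebuilds the sample data from the question and compares the result against values worked out by hand. It assumes dplyr 1.0 or later (for the `.groups` argument) and uses `n()` in place of `length(value)`; both are substitutions of mine, not part of the answer above.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ID        = c(1, 1, 1, 2, 2, 2, 3, 3),
  category  = c("green", "green", "red", "red", "green", "green", "yellow", "yellow"),
  direction = c("in", "out", "in", "out", "in", "out", "in", "out"),
  value     = c(4, 5, 6, 7, 8, 9, 10, 11)
)

res <- df %>%
  group_by(ID) %>%
  mutate(category = last(category)) %>%    # last category per ID, copied to every row
  group_by(ID, direction, category) %>%    # replaces the ID-only grouping
  summarise(
    sum_value   = sum(value),
    count_value = n(),                     # assumption: n() instead of length(value)
    .groups     = "drop"                   # assumption: dplyr >= 1.0
  )

# Hand-computed expectations, one row per ID/direction pair
stopifnot(
  identical(as.character(res$category),
            c("red", "red", "green", "green", "yellow", "yellow")),
  all(res$sum_value == c(10, 5, 8, 16, 10, 11)),
  all(res$count_value == c(2, 1, 1, 2, 1, 1))
)
```

For example, ID 1 ends on "red", so both of its rows report "red", and its "in" row sums 4 + 6 = 10 over two observations.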