Create aggregate variable in long data format
I'm sure a similar question already exists, but I couldn't get any of the existing answers to work.
I am trying to calculate aggregates (or subtotals) in a data frame in long format. In the group column I want an aggregate entry "AGG" whose value is the sum of "value" for each combination of "Year" and "var". I tried the aggregate() function without success, using this code:
aggregate(value ~ cbind(Year,var), data = Energi5, FUN = sum)
My data looks like this:
> head(df)
Year group var value
1 1966 A x 25465462
2 1966 B x 9512621
3 1966 E x 2832865
4 1966 H x 291769
5 1966 NE x 141524912
6 1966 NF x 23580353
> tail(df)
Year group var value
5403 2017 NZ y 167158
5404 2017 O y 23480
5405 2017 QF y 0
5406 2017 QS y 0
5407 2017 QZ y 16447
5408 2017 TC3000 y 488556
and I would like to obtain something like this at the end of (or in the middle of) my existing data frame:
Year group var value
5409 1966 AGG x ?
5410 1967 AGG x ?
...
5450 2017 AGG x ?
5451 1966 AGG y ?
...
I hope you can help. Thank you!
The error lies in how you are declaring the formula: the grouping variables on the right-hand side should be combined with + (or *), not with cbind(). See ?formula in the manual.
# Example
year <- rep(seq(1966, 2020), each = 8)
group <- rep(letters[1:4], times = 2*(2021-1966))
var <- rep(c("x", "y"), times = length(year)/2)
value <- rnorm(length(year))
data <- cbind.data.frame(year, group, var, value)
# Solution
aggregate(value ~ year * var, data, FUN=sum)
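The aggregate() call above returns only the sums. A minimal base R sketch of how those sums could be appended to the original data as "AGG" rows, which is what the question asks for, might look like this. The small Energi5 data frame here is made-up toy data with the question's column names, and agg and result are hypothetical names chosen for illustration.

```r
# Toy data in the same long layout as the question (values are invented).
Energi5 <- data.frame(
  Year  = rep(c(1966, 1967), each = 4),
  group = rep(c("A", "B"), times = 4),
  var   = rep(rep(c("x", "y"), each = 2), times = 2),
  value = c(10, 20, 1, 2, 30, 40, 3, 4),
  stringsAsFactors = FALSE
)

# Sum value for every Year/var combination.
agg <- aggregate(value ~ Year + var, data = Energi5, FUN = sum)

# Label the subtotal rows and put the columns in the original order.
agg$group <- "AGG"
agg <- agg[, names(Energi5)]

# Append the subtotal rows to the end of the original data frame.
result <- rbind(Energi5, agg)
rownames(result) <- NULL
```

With the real data, only the last three steps are needed; result then holds the original rows followed by one "AGG" row per Year and var.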
There is probably a more efficient way to do this, but does this help?
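library(dplyr)
# One subtotal row per Year/var combination, labelled "AGG"
df <- Energi5 %>%
  group_by(Year, var) %>%
  summarise(value = sum(value), .groups = "drop") %>%
  mutate(group = "AGG")
# Append the subtotal rows to the original data
Energi5 <- bind_rows(Energi5, df)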