Create aggregate variable in long data format

I'm sure there's a similar question already, but I couldn't make the answers work for my case.

I am trying to calculate aggregates (or subtotals) in a data frame in long format. In the group column I want an aggregate entry "AGG" whose value is the sum of "value" for each combination of "Year" and "var". I tried the aggregate() function, but didn't succeed. I used this code:

aggregate(value ~ cbind(Year,var), data = Energi5, FUN = sum)

My data looks like this:

> head(df)
     Year group  var     value
1    1966       A x   25465462
2    1966       B x    9512621
3    1966       E x    2832865
4    1966       H x     291769
5    1966      NE x  141524912
6    1966      NF x   23580353
> tail(df)
     Year group   var  value
5403 2017     NZ y    167158
5404 2017      O y     23480
5405 2017     QF y         0
5406 2017     QS y         0
5407 2017     QZ y     16447
5408 2017 TC3000 y    488556

and I would like to obtain something like this at the end of (or in the middle of) my existing data frame:

     Year group   var  value
5409 1966   AGG   x        ?
5410 1967   AGG   x        ?
...
5450 2017   AGG   x        ?
5451 1966   AGG   y        ?
...

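I hope you can help. Thank you!

The error lies in how you are declaring the formula. The grouping variables go on the right-hand side as separate terms, not wrapped in cbind(); in aggregate(), cbind() is meant for combining several response variables on the left-hand side. See ?formula and ?aggregate in the manual.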

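# Example: 55 years (1966-2020), 8 rows per year
set.seed(1)  # make the random values reproducible
year <- rep(seq(1966, 2020), each = 8)
group <- rep(letters[1:4], times = 2*(2021-1966))
# within each year, groups a-d appear once under "x" and once under "y"
var <- rep(rep(c("x", "y"), each = 4), times = length(year)/8)
value <- rnorm(length(year))

data <- data.frame(year, group, var, value)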

# Solution
aggregate(value ~ year * var, data, FUN=sum)
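
The aggregate() call returns the subtotals as a separate data frame. To get them into the original data as rows with group "AGG", as the question asks, the result can be labelled and appended. This is only a minimal sketch, assuming the column names Year, group, var and value from the question; the toy values and the names agg and result are made up for illustration.

```r
# Toy data with the same columns as in the question (values are made up)
df <- data.frame(
  Year  = rep(c(1966, 1967), each = 4),
  group = rep(c("A", "B"), times = 4),
  var   = rep(rep(c("x", "y"), each = 2), times = 2),
  value = c(1, 2, 3, 4, 5, 6, 7, 8),
  stringsAsFactors = FALSE
)

# One subtotal per Year/var combination
agg <- aggregate(value ~ Year + var, data = df, FUN = sum)

# Label the subtotals and put the columns in the same order as df
agg$group <- "AGG"
agg <- agg[, names(df)]

# Append the subtotal rows below the original rows
result <- rbind(df, agg)
```

Here the subtotals end up at the bottom of the data frame; sorting with order(result$Year, result$var) would place each "AGG" row next to the rows it summarises.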

There is probably a more efficient way to do this, but does this help?

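library(dplyr)

# One subtotal row per Year/var combination, labelled "AGG"
agg <- Energi5 %>%
  group_by(Year, var) %>%
  summarise(value = sum(value), .groups = "drop") %>%
  mutate(group = "AGG")

# Append the subtotal rows to the original data
Energi5 <- bind_rows(Energi5, agg)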

