How to sum values with multiple conditions per year in R
I have count data from different regions per year. The original data is structured like this:
   count region year
1      1      A 2011
2      2      A 2010
3      1      A 2009
4      5      A 2008
5      4      A 2007
6      2      B 2011
7      2      B 2010
8      1      B 2009
9      5      B 2008
10     3      B 2007
11     3      C 2011
12     3      C 2010
13     2      C 2009
14     1      C 2008
15     3      C 2007
16     4      D 2011
17     3      D 2010
18     2      D 2009
19     1      D 2008
20     4      D 2007
I now need to sum the values for regions A and D only, per year, and keep A as the region label for these calculated sums. The output should look like this:
   count region year
1      5      A 2011
2      5      A 2010
3      3      A 2009
4      6      A 2008
5      8      A 2007
6      2      B 2011
7      2      B 2010
8      1      B 2009
9      5      B 2008
10     3      B 2007
11     3      C 2011
12     3      C 2010
13     2      C 2009
14     1      C 2008
15     3      C 2007
The counts for regions B and C should not change. I have tried but never got the needed output. Does anyone have a tip? I would be very grateful.
We may replace the D in region with A, and then do a group_by on region and year followed by a sum of count:
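library(dplyr)

df1 %>%
  # relabel D as A inside group_by so that A and D rows share a group
  group_by(region = replace(region, region == 'D', 'A'), year) %>%
  # sum the counts per region/year pair and return an ungrouped result
  summarise(count = sum(count), .groups = 'drop')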
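
For anyone who wants to check the result without installing dplyr, here is a minimal base R sketch of the same idea. It assumes the data frame is called df1, as in the answer, and rebuilds it from the table in the question. The aggregate() call and the final reordering are a swapped-in equivalent, not the answer's own method. Grouped summaries generally come back sorted by the grouping keys, so the last step restores the column order and the descending years shown in the question.

```r
# Rebuild the question's data; the name df1 is taken from the answer
df1 <- data.frame(
  count  = c(1, 2, 1, 5, 4,  2, 2, 1, 5, 3,  3, 3, 2, 1, 3,  4, 3, 2, 1, 4),
  region = rep(c("A", "B", "C", "D"), each = 5),
  year   = rep(2011:2007, times = 4),
  stringsAsFactors = FALSE
)

# Relabel region D as A so both regions fall into the same group
df1$region[df1$region == "D"] <- "A"

# Sum count within each region/year pair
res <- aggregate(count ~ region + year, data = df1, FUN = sum)

# Match the question's layout: region ascending, year descending,
# columns in the order count, region, year
res <- res[order(res$region, -res$year), c("count", "region", "year")]
rownames(res) <- NULL

print(res)
```

Regions B and C pass through unchanged because each of their region/year groups contains a single row, so the sum is just the original count.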