R Aggregate data frame based on column values
I have a dataset that looks like this:
> newex
       Name Volume Period
1       oil  29000 Jun 21
2      gold    800 Mar 22
3       oil  21000 Jul 21
4      gold   1100 Sep 21
5      gold   3000 Feb 21
6   depower      3  Q1 21
7       oil  23000 Apr 22
8   czpower     26  Q1 23
9       oil  17000  Q1 21
10     gold   2400 May 21
11      oil  12000  Q2 21
12     gold   1800 Jan 22
13  czpower     21 Oct 21
14 api2coal   6000  Q1 22
15 api2coal  11000  Q1 21
16  depower     11 Jan 22
17 api2coal  16000 Jul 21
18     gold   1300 Mar 21
19  depower      3  Q1 22
20      oil  17000 Cal 21
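I want to reshape the dataset to get a data frame with the following characteristics:
- the values in Name become the new variables (columns);
- the values in Period become the index (and should be unique);
- the values in Volume are summed for each combination of Name and Period.
Can someone give me a hint on how to achieve this? Thanks in advance.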
We can use pivot_wider:
library(dplyr)
library(tidyr)
newex %>%
  pivot_wider(
    names_from = Name,
    values_from = Volume
  )
   Period   oil  gold depower czpower api2coal
   <chr>  <int> <int>   <int>   <int>    <int>
 1 Jun 21 29000    NA      NA      NA       NA
 2 Mar 22    NA   800      NA      NA       NA
 3 Jul 21 21000    NA      NA      NA    16000
 4 Sep 21    NA  1100      NA      NA       NA
 5 Feb 21    NA  3000      NA      NA       NA
 6 Q1 21  17000    NA       3      NA    11000
 7 Apr 22 23000    NA      NA      NA       NA
 8 Q1 23     NA    NA      NA      26       NA
 9 May 21    NA  2400      NA      NA       NA
10 Q2 21  12000    NA      NA      NA       NA
11 Jan 22    NA  1800      11      NA       NA
12 Oct 21    NA    NA      NA      21       NA
13 Q1 22     NA    NA       3      NA     6000
14 Mar 21    NA  1300      NA      NA       NA
15 Cal 21 17000    NA      NA      NA       NA
Data:
newex <- structure(list(Name = c("oil", "gold", "oil", "gold", "gold",
"depower", "oil", "czpower", "oil", "gold", "oil", "gold", "czpower",
"api2coal", "api2coal", "depower", "api2coal", "gold", "depower",
"oil"), Volume = c(29000L, 800L, 21000L, 1100L, 3000L, 3L, 23000L,
26L, 17000L, 2400L, 12000L, 1800L, 21L, 6000L, 11000L, 11L, 16000L,
1300L, 3L, 17000L), Period = c("Jun 21", "Mar 22", "Jul 21",
"Sep 21", "Feb 21", "Q1 21", "Apr 22", "Q1 23", "Q1 21", "May 21",
"Q2 21", "Jan 22", "Oct 21", "Q1 22", "Q1 21", "Jan 22", "Jul 21",
"Mar 21", "Q1 22", "Cal 21")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"))
library(data.table)
dcast(setDT(newex), Period ~ Name, value.var = "Volume", fun.aggregate = sum)
Output:
    Period api2coal czpower depower gold   oil
 1: Apr 22        0       0       0    0 23000
 2: Cal 21        0       0       0    0 17000
 3: Feb 21        0       0       0 3000     0
 4: Jan 22        0       0      11 1800     0
 5: Jul 21    16000       0       0    0 21000
 6: Jun 21        0       0       0    0 29000
 7: Mar 21        0       0       0 1300     0
 8: Mar 22        0       0       0  800     0
 9: May 21        0       0       0 2400     0
10: Oct 21        0      21       0    0     0
11:  Q1 21    11000       0       3    0 17000
12:  Q1 22     6000       0       3    0     0
13:  Q1 23        0      26       0    0     0
14:  Q2 21        0       0       0    0 12000
15: Sep 21        0       0       0 1100     0
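A side note on the zeros in that output: as far as I understand the dcast documentation, when no fill is given, fun.aggregate is evaluated on an empty vector for Period/Name pairs that never occur, and sum() of nothing is 0. If a missing combination should stay distinguishable from a genuine zero, passing fill explicitly is one option. A minimal sketch on a made-up three-row table (toy and its values are hypothetical, not the asker's data):

```r
library(data.table)

# Hypothetical toy data: "oil" occurs twice in "Q1 21",
# and there is no "gold" row for "Q1 21" nor "oil" row for "Jan 22".
toy <- data.table(
  Name   = c("oil", "oil", "gold"),
  Volume = c(100L, 50L, 7L),
  Period = c("Q1 21", "Q1 21", "Jan 22")
)

wide <- dcast(
  toy,
  Period ~ Name,
  value.var     = "Volume",
  fun.aggregate = sum,          # sum duplicates within each Period/Name pair
  fill          = NA_integer_   # absent pairs become NA instead of 0
)
print(wide)
```

Building the table with data.table() here also sidesteps setDT(), which converts the original data frame in place.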
I found the answer after trying the previous answers:
newex %>%
  pivot_wider(
    names_from = Name,
    values_from = Volume,
    values_fn = sum
  )
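The sample data happens to contain no repeated Name/Period pair, so the effect of values_fn is not visible in it. Here is a small sketch on made-up data (toy and its values are hypothetical) where "oil" appears twice for the same period. The values_fill argument is an optional extra, assuming you would rather see 0 than NA for combinations that do not occur:

```r
library(tidyr)

# Hypothetical toy data: "oil" appears twice for "Q1 21".
toy <- data.frame(
  Name   = c("oil", "oil", "gold"),
  Volume = c(100L, 50L, 7L),
  Period = c("Q1 21", "Q1 21", "Jan 22")
)

wide <- pivot_wider(
  toy,
  names_from  = Name,
  values_from = Volume,
  values_fn   = sum,  # collapse duplicate Name/Period pairs by summing
  values_fill = 0L    # optional: fill absent combinations with 0 instead of NA
)
print(wide)
```

Without values_fn, pivot_wider keeps duplicated pairs as list-columns and warns about it, which is usually the sign that an aggregation function is needed.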