Pivot_table summation: KeyError('%s not in index' % objarr[mask])
I have a table like this:
Category Customer Month Year Unit Unit Symbol Value
0 AF Brand1 1 2017 Gross Sales $ 1
1 AF Brand1 1 2017 Sales quantity EAU 1
2 AF Brand1 2 2017 Gross Sales $ 1
3 AF Brand1 2 2017 Sales quantity EAU 1
4 AF Brand1 3 2017 Gross Sales $ 1
5 AF Brand1 3 2017 Sales quantity EAU 1
6 AF Brand1 4 2017 Gross Sales $ 1
7 AF Brand1 4 2017 Sales quantity EAU 1
8 AF Brand1 5 2017 Gross Sales $ 1
9 AF Brand1 5 2017 Sales quantity EAU 1
10 AF Brand2 1 2017 Gross Sales $ 1
11 AF Brand2 1 2017 Sales quantity EAU 1
12 AF Brand2 2 2017 Gross Sales $ 1
13 AF Brand2 2 2017 Sales quantity EAU 1
14 AF Brand2 3 2017 Gross Sales $ 1
15 AF Brand2 3 2017 Sales quantity EAU 1
16 AF Brand2 4 2017 Gross Sales $ 1
17 AF Brand2 4 2017 Sales quantity EAU 1
18 AF Brand2 5 2017 Gross Sales $ 1
19 AF Brand2 5 2017 Sales quantity EAU 1
I have already loaded it into memory.
I want to remove the Customer column and aggregate the Value column for all records where the other column values are the same.
E.g., for all records where Category, Month, Year, Unit and Unit Symbol are the same, I want the Value field summed into a new frame, as shown below:
Category Month Year Unit Unit Symbol Value
0 AF 1 2017 Gross Sales $ 2
1 AF 1 2017 Sales quantity EAU 2
2 AF 2 2017 Gross Sales $ 2
3 AF 2 2017 Sales quantity EAU 2
4 AF 3 2017 Gross Sales $ 2
5 AF 3 2017 Sales quantity EAU 2
6 AF 4 2017 Gross Sales $ 2
7 AF 4 2017 Sales quantity EAU 2
8 AF 5 2017 Gross Sales $ 2
9 AF 5 2017 Sales quantity EAU 2
I have tried different variations on:
df.pivot_table(columns=['Unit', 'Unit Symbol', 'month', 'year'], index='Category', aggfunc=sum, values="Value")
but it always returns an error like
KeyError('%s not in index' % objarr[mask])
followed by a list of my customers. This doesn't make sense to me, as I am pivoting my data to get rid of my customers and aggregate.
I have 12 different customers and 13 different categories. Not all customers feature all categories, and vice versa. Their associations change over time, so hard-coding this is not practical.
How can I sum my table in this fashion?
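List every column you want to keep in index and sum Value. Customer is left out, so it is aggregated away; the assign call then adds a Customer column set to 'Total':
(df.pivot_table(index=['Category', 'Month', 'Year', 'Unit', 'Unit Symbol'],
                values='Value', aggfunc='sum')
   .reset_index()
   .assign(Customer='Total'))
Output: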
Category Month Year Unit Unit Symbol Value Customer
0 AF 1 2017 Gross Sales $ 2 Total
1 AF 1 2017 Sales quantity EAU 2 Total
2 AF 2 2017 Gross Sales $ 2 Total
3 AF 2 2017 Sales quantity EAU 2 Total
4 AF 3 2017 Gross Sales $ 2 Total
5 AF 3 2017 Sales quantity EAU 2 Total
6 AF 4 2017 Gross Sales $ 2 Total
7 AF 4 2017 Sales quantity EAU 2 Total
8 AF 5 2017 Gross Sales $ 2 Total
9 AF 5 2017 Sales quantity EAU 2 Total
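The same aggregation can probably also be written with groupby, which some readers find easier to follow than pivot_table. The following is a minimal sketch, not the answerer's method: it rebuilds the sample table from the question by hand (the `rows`, `keys` and `totals` names are made up for illustration) and assumes the column names are capitalized exactly as shown in the table.

```python
import pandas as pd

# Rebuild the sample table from the question: 2 customers x 5 months x 2 units.
# (Illustrative reconstruction; the real data is assumed to have the same columns.)
units = [("Gross Sales", "$"), ("Sales quantity", "EAU")]
rows = [
    {"Category": "AF", "Customer": customer, "Month": month, "Year": 2017,
     "Unit": unit, "Unit Symbol": symbol, "Value": 1}
    for customer in ("Brand1", "Brand2")
    for month in range(1, 6)
    for unit, symbol in units
]
df = pd.DataFrame(rows)

# Group on every column except Customer and Value, then sum Value.
# Column names are case-sensitive: 'Month' and 'Year', not 'month' and 'year'.
keys = ["Category", "Month", "Year", "Unit", "Unit Symbol"]
totals = df.groupby(keys, as_index=False)["Value"].sum()

print(totals)
```

Because Customer is not one of the grouping keys, it does not appear in the result, so nothing about individual customers or categories needs to be hard-coded.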