Add columns to pivot table with pandas
My table is as follows:
import pandas as pd
import numpy as np
#simple table
fazenda = [6010,6010,6010,6010]
quadra = [1,1,2,2]
talhao = [1,2,3,4]
arTotal = [32.12,33.13,34.14,35.15]
arCarr = [i/2 for i in arTotal]
arProd = [i/2 for i in arTotal]
varCan = ['RB1','RB2','RB3','RB4']
data = list(zip(fazenda,quadra,talhao,arTotal,arCarr,arProd,varCan))
#Pandas DataFrame
df = pd.DataFrame(data=data,columns=['Fazenda','Quadra','Talhao','ArTotal','ArCarr','ArProd','Variedade'])
#Pivot Table
table = pd.pivot_table(df, values=['ArTotal','ArCarr','ArProd'],index=['Quadra','Talhao'], fill_value=0)
print(table)
The result is:
               ArCarr  ArProd  ArTotal
Quadra Talhao
1      1       16.060  16.060    32.12
       2       16.565  16.565    33.13
2      3       17.070  17.070    34.14
       4       17.575  17.575    35.15
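I need two additional steps: adding the Variedade column, and adding Total and Grand Total rows.
I tried to add the column, but the result was incorrect. I followed some links about Total and Grand Total, but did not get a satisfactory result.
I am having a hard time understanding pandas, so I am asking more experienced colleagues for help.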
First, get the correct pivot.
In [404]: values = ['ArTotal','ArCarr','ArProd']
In [405]: table = pd.pivot_table(df, values=values, index=['Quadra','Talhao','Variedade'],
fill_value=0).reset_index(level=-1)
Get the grand total
In [406]: Gt = table[values].sum()
Get the Quadra-level subtotals
In [407]: St = table.sum(level='Quadra')
Use append to reshape table
In [408]: (table.append(
St.assign(Talhao='Total').set_index('Talhao', append=True)
).sort_index()
.append(pd.DataFrame([Gt.values], columns=Gt.index,
index=pd.MultiIndex.from_tuples([('Grand Total', '')],
names=['Quadra', 'Talhao']))
).fillna(''))
Out[408]:
                    ArCarr  ArProd ArTotal Variedade
Quadra      Talhao
1           1       16.060  16.060   32.12       RB1
            2       16.565  16.565   33.13       RB2
            Total   32.625  32.625   65.25
2           3       17.070  17.070   34.14       RB3
            4       17.575  17.575   35.15       RB4
            Total   34.645  34.645   69.29
Grand Total         67.270  67.270  134.54
Details
In [409]: table
Out[409]:
              Variedade  ArCarr  ArProd  ArTotal
Quadra Talhao
1      1            RB1  16.060  16.060    32.12
       2            RB2  16.565  16.565    33.13
2      3            RB3  17.070  17.070    34.14
       4            RB4  17.575  17.575    35.15
In [410]: Gt
Out[410]:
ArTotal    134.54
ArCarr      67.27
ArProd      67.27
dtype: float64
In [411]: St
Out[411]:
        ArCarr  ArProd  ArTotal
Quadra
1       32.625  32.625    65.25
2       34.645  34.645    69.29
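For readers on pandas 2.0 or later, where `DataFrame.append` and the `level=` argument of `sum` are no longer available, the same steps can be sketched with `groupby` and `pd.concat`. This is only a minimal sketch, not the code from the answer above: the loop, the names `pieces`, `total`, and `grand`, and the choice to put an empty string in `Variedade` on the total rows are all assumptions made for illustration.

```python
import pandas as pd

# Rebuild the question's data.
ar_total = [32.12, 33.13, 34.14, 35.15]
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': ar_total,
    'ArCarr': [v / 2 for v in ar_total],
    'ArProd': [v / 2 for v in ar_total],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})

values = ['ArTotal', 'ArCarr', 'ArProd']
names = ['Quadra', 'Talhao']

# Detail rows: pivot on all three keys, then move Variedade back to a column.
table = (pd.pivot_table(df, values=values,
                        index=['Quadra', 'Talhao', 'Variedade'], fill_value=0)
           .reset_index(level='Variedade'))

pieces = []  # hypothetical name: blocks collected in display order
for quadra, block in table.groupby(level='Quadra'):
    pieces.append(block)
    # One subtotal row per Quadra, labelled 'Total' in the Talhao level.
    total = block[values].sum().to_frame().T
    total['Variedade'] = ''  # assumption: blank instead of NaN on total rows
    total.index = pd.MultiIndex.from_tuples([(quadra, 'Total')], names=names)
    pieces.append(total)

# Grand total row over all detail rows.
grand = table[values].sum().to_frame().T
grand['Variedade'] = ''
grand.index = pd.MultiIndex.from_tuples([('Grand Total', '')], names=names)
pieces.append(grand)

result = pd.concat(pieces)
print(result)
```

Building the rows group by group keeps them in display order, so there is no need to sort an index that mixes integers and strings.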
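I think John's solution beat me to it, but given your current output you cannot do this with a pivot table alone. You can do it in a series of steps: use a list comprehension over the grouped data, then append the sums, i.e.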
cols = ['Fazenda','Variedade','Quadra','Talhao']
# Note: DataFrame.append was removed in pandas 2.0, so this code needs an earlier version
ndf = pd.concat([i.append(i.drop(columns=cols).sum(), ignore_index=True) for _,i in df.groupby('Quadra')])
ndf['Talhao'] = ndf['Talhao'].fillna('Total')
ndf['Quadra'] = ndf['Quadra'].ffill()
new = ndf.set_index(['Quadra','Talhao']).drop(columns=['Fazenda'])
new = new.append(pd.DataFrame(df.sum()).T.drop(columns=cols).set_index(pd.MultiIndex.from_tuples([('Grand Total', '')]))).fillna('')
Output:
                    ArCarr  ArProd ArTotal Variedade
Quadra      Talhao
1.0         1.0     16.060  16.060   32.12       RB1
            2.0     16.565  16.565   33.13       RB2
            Total   32.625  32.625   65.25
2.0         3.0     17.070  17.070   34.14       RB3
            4.0     17.575  17.575   35.15       RB4
            Total   34.645  34.645   69.29
Grand Total         67.270  67.270  134.54
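On the point that a pivot table alone does not produce this layout: `pivot_table` does accept `margins=True`, but as far as I can tell it only adds one overall row, not per-Quadra subtotals. Here is a small sketch of that, assuming `aggfunc='sum'` is acceptable (each Quadra/Talhao pair has a single row, so sum and mean give the same detail values). `margins_name='Grand Total'` is just a label chosen to match the output above, and only the columns used here are rebuilt.

```python
import pandas as pd

# Rebuild only the columns needed for this sketch.
ar_total = [32.12, 33.13, 34.14, 35.15]
df = pd.DataFrame({
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': ar_total,
    'ArCarr': [v / 2 for v in ar_total],
    'ArProd': [v / 2 for v in ar_total],
})

values = ['ArTotal', 'ArCarr', 'ArProd']

# margins=True appends a single overall row; no per-Quadra subtotals are added.
with_margin = pd.pivot_table(df, values=values, index=['Quadra', 'Talhao'],
                             aggfunc='sum', margins=True,
                             margins_name='Grand Total')
print(with_margin)
```

So the built-in margins cover the Grand Total row only; the per-Quadra Total rows still need the grouping steps shown in the answers.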