How to aggregate in a pivot table in pandas
I have the following dataframe in pandas:
code date tank nozzle qty amount
123 2018-01-01 1 1 100 0
123 2018-01-01 1 2 0 50
123 2018-01-01 1 2 0 50
123 2018-01-01 1 2 100 0
123 2018-01-02 1 1 0 70
123 2018-01-02 1 1 0 50
123 2018-01-02 1 2 100 0
My desired dataframe is:
code date tank nozzle_1_qty nozzle_2_qty nozzle_1_amount nozzle_2_amount
123 2018-01-01 1 100 100 0 100
123 2018-01-02 1 0 100 120 0
I am doing the following in pandas:
df= (df.pivot_table(index=['date', 'tank'], columns='nozzle',
values=['qty','amount']).add_prefix('nozzle_')
.reset_index()
)
But this does not give me my desired output.
The default aggregation function in pivot_table is np.mean, so it is necessary to change it to sum and then flatten the MultiIndex columns in a list comprehension:
df = df.pivot_table(index=['code','date', 'tank'],
columns='nozzle',
values=['qty','amount'], aggfunc='sum')
# Python 3.6+
df.columns = [f'nozzle_{b}_{a}' for a, b in df.columns]
# Python below 3.6
#df.columns = ['nozzle_{}_{}'.format(b, a) for a, b in df.columns]
df = df.reset_index()
print(df)
   code        date  tank  nozzle_1_amount  nozzle_2_amount  nozzle_1_qty  nozzle_2_qty
0   123  2018-01-01     1                0              100           100           100
1   123  2018-01-02     1              120                0             0           100
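To see why the aggregation function matters, here is a minimal sketch that rebuilds the sample data from the question and compares the default mean with an explicit sum for one cell. The names index_cols, by_mean, by_sum and key are only illustrative, and the date column is assumed to hold plain strings.

```python
import pandas as pd

# Sample data copied from the question (dates kept as strings, an assumption).
df = pd.DataFrame({
    "code": [123] * 7,
    "date": ["2018-01-01"] * 4 + ["2018-01-02"] * 3,
    "tank": [1] * 7,
    "nozzle": [1, 2, 2, 2, 1, 1, 2],
    "qty": [100, 0, 0, 100, 0, 0, 100],
    "amount": [0, 50, 50, 0, 70, 50, 0],
})

index_cols = ["code", "date", "tank"]

# No aggfunc given: repeated rows for the same nozzle are averaged.
by_mean = df.pivot_table(index=index_cols, columns="nozzle",
                         values=["qty", "amount"])

# aggfunc="sum": repeated rows for the same nozzle are added up.
by_sum = df.pivot_table(index=index_cols, columns="nozzle",
                        values=["qty", "amount"], aggfunc="sum")

# Nozzle 1 on 2018-01-02 has two rows with amounts 70 and 50.
key = (123, "2018-01-02", 1)
print(by_mean.loc[key, ("amount", 1)])  # 60.0 (average of 70 and 50)
print(by_sum.loc[key, ("amount", 1)])   # 120 (70 + 50)
```

Only the summed value matches the 120 shown in the desired output.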
I don't use pivot_table much in pandas, but you can get your result using groupby and some reshaping.
df = df.groupby(['code', 'date', 'tank', 'nozzle']).sum().unstack()
The columns will be a MultiIndex that you may want to rename.
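One possible way to do that renaming is sketched below, assuming the sample data from the question. The variable name wide and the nozzle_{n}_{value} naming pattern are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (dates kept as strings, an assumption).
df = pd.DataFrame({
    "code": [123] * 7,
    "date": ["2018-01-01"] * 4 + ["2018-01-02"] * 3,
    "tank": [1] * 7,
    "nozzle": [1, 2, 2, 2, 1, 1, 2],
    "qty": [100, 0, 0, 100, 0, 0, 100],
    "amount": [0, 50, 50, 0, 70, 50, 0],
})

# Sum per (code, date, tank, nozzle), then move nozzle into the columns.
wide = (df.groupby(["code", "date", "tank", "nozzle"])[["qty", "amount"]]
          .sum()
          .unstack())

# Columns are now (value, nozzle) tuples; flatten them into single names.
wide.columns = [f"nozzle_{nozzle}_{value}" for value, nozzle in wide.columns]

# Turn the group keys back into ordinary columns.
wide = wide.reset_index()
print(wide)
```

Selecting qty before amount in the groupby keeps the columns in the same order as the desired dataframe in the question.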