Add columns to pivot table with pandas

I have the following table:

import pandas as pd
import numpy as np

#simple table
fazenda = [6010,6010,6010,6010]
quadra = [1,1,2,2]
talhao = [1,2,3,4]
arTotal = [32.12,33.13,34.14,35.15]
arCarr = [i/2 for i in arTotal]
arProd = [i/2 for i in arTotal]
varCan = ['RB1','RB2','RB3','RB4']
data = list(zip(fazenda,quadra,talhao,arTotal,arCarr,arProd,varCan))

#Pandas DataFrame
df = pd.DataFrame(data=data,columns=['Fazenda','Quadra','Talhao','ArTotal','ArCarr','ArProd','Variedade'])

#Pivot Table
table = pd.pivot_table(df, values=['ArTotal','ArCarr','ArProd'],index=['Quadra','Talhao'], fill_value=0)

print(table)

which results in this:

               ArCarr  ArProd  ArTotal
Quadra Talhao                         
1      1       16.060  16.060    32.12
       2       16.565  16.565    33.13
2      3       17.070  17.070    34.14
       4       17.575  17.575    35.15

I need two additional steps:

  1. Add the Subtotal and Grand Total for the 'ArTotal', 'ArCarr' and 'ArProd' fields
  2. Add the 'Variedade' field to the table

[Image: desired result]

I tried to add the column, but the result was incorrect. I also followed some links about subtotals and grand totals, but did not get a satisfactory result.

I'm having a hard time understanding pandas, so I am asking more experienced colleagues for help.

Get the pivot right first.

In [404]: values = ['ArTotal','ArCarr','ArProd']

In [405]: table = pd.pivot_table(df, values=values, index=['Quadra','Talhao','Variedade'], 
                                 fill_value=0).reset_index(level=-1)

Get the grand totals

In [406]: Gt = table[values].sum()

Get the Quadra-level totals

In [407]: St = table.sum(level='Quadra')

Use append to reshape the table

In [408]: (table.append(
                 St.assign(Talhao='Total').set_index('Talhao', append=True)
                ).sort_index()
                .append(pd.DataFrame([Gt.values], columns=Gt.index,
                                     index=pd.MultiIndex.from_tuples([('Grand Total', '')],
                                     names=['Quadra', 'Talhao']))
                ).fillna(''))
Out[408]:
                    ArCarr  ArProd  ArTotal Variedade
Quadra      Talhao
1           1       16.060  16.060    32.12       RB1
            2       16.565  16.565    33.13       RB2
            Total   32.625  32.625    65.25
2           3       17.070  17.070    34.14       RB3
            4       17.575  17.575    35.15       RB4
            Total   34.645  34.645    69.29
Grand Total         67.270  67.270   134.54
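
The session above relies on `DataFrame.append` and `sum(level=...)`, both of which were removed in pandas 2.0. The following is a minimal sketch of the same steps for newer pandas versions, assuming `pd.concat` and `groupby(level=...)` as replacements. The names `flat`, `sub`, `body`, `grand` and `result` are illustrative and do not come from the original answer. It sorts on `Quadra` alone with a stable sort rather than calling `sort_index()`, so that the mixed integer/string `Talhao` values never have to be compared.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# Pivot with Variedade in the index, then move it back to a column.
table = pd.pivot_table(df, values=values,
                       index=['Quadra', 'Talhao', 'Variedade'],
                       fill_value=0).reset_index(level=-1)

# Work on flat columns so that concat and sorting stay simple.
flat = table.reset_index()

# Quadra-level subtotals (replaces table.sum(level='Quadra')).
sub = (table[values].groupby(level='Quadra').sum()
       .reset_index()
       .assign(Talhao='Total'))

# A stable sort on Quadra keeps each 'Total' row after its detail rows.
body = (pd.concat([flat, sub], ignore_index=True)
        .sort_values('Quadra', kind='stable'))

# Grand total as a one-row frame.
grand = (table[values].sum().to_frame().T
         .assign(Quadra='Grand Total', Talhao=''))

result = (pd.concat([body, grand], ignore_index=True)
          .set_index(['Quadra', 'Talhao']))
result['Variedade'] = result['Variedade'].fillna('')

print(result)
```

The column order of the printed frame may differ from the output shown above, since it follows the order in which the columns are encountered during concatenation.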

Details

In [409]: table
Out[409]:
              Variedade  ArCarr  ArProd  ArTotal
Quadra Talhao
1      1            RB1  16.060  16.060    32.12
       2            RB2  16.565  16.565    33.13
2      3            RB3  17.070  17.070    34.14
       4            RB4  17.575  17.575    35.15

In [410]: Gt
Out[410]:
ArTotal    134.54
ArCarr      67.27
ArProd      67.27
dtype: float64

In [411]: St
Out[411]:
        ArCarr  ArProd  ArTotal
Quadra
1       32.625  32.625    65.25
2       34.645  34.645    69.29
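
If only the grand total is needed, `pivot_table` can append it by itself through its `margins` and `margins_name` parameters. The sketch below is an aside, not part of the answer above: it assumes `aggfunc='sum'` so that the margin row is a sum rather than the default mean, and it does not produce the per-Quadra subtotals. The name `with_total` is illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# margins=True adds one extra row holding the aggregate over all rows.
with_total = pd.pivot_table(df, values=values,
                            index=['Quadra', 'Talhao'],
                            aggfunc='sum',
                            margins=True,
                            margins_name='Grand Total')

print(with_total)
```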

I think John's solution beats mine, but based on your current output you can't do that with a pivot table alone. Instead, you can use a series of steps: a list comprehension over the grouped data that appends each group's sums, followed by appending the grand total, i.e.

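# Note: DataFrame.append requires pandas < 2.0
cols = ['Fazenda','Variedade','Quadra','Talhao']
# For each Quadra group, append a row holding the sums of the numeric columns
ndf = pd.concat([i.append(i.drop(cols, axis=1).sum(), ignore_index=True) for _,i in df.groupby('Quadra')])

# Label the appended rows and fill in their Quadra from the rows above
ndf['Talhao'] = ndf['Talhao'].fillna('Total')
ndf['Quadra'] = ndf['Quadra'].ffill()

new = ndf.set_index(['Quadra','Talhao']).drop(['Fazenda'], axis=1)

# Append the grand total row
new = new.append(pd.DataFrame(df.sum()).T.drop(cols, axis=1).set_index(pd.MultiIndex.from_tuples([('Grand Total', '')]))).fillna('')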

Output:

                    ArCarr  ArProd  ArTotal Variedade
Quadra      Talhao                                   
1.0         1.0     16.060  16.060    32.12       RB1
            2.0     16.565  16.565    33.13       RB2
            Total   32.625  32.625    65.25          
2.0         3.0     17.070  17.070    34.14       RB3
            4.0     17.575  17.575    35.15       RB4
            Total   34.645  34.645    69.29          
Grand Total         67.270  67.270   134.54
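
Because `DataFrame.append` and the positional `axis` argument of `drop` are no longer available in pandas 2.0 and later, the group-by-group idea above can be sketched with `pd.concat` instead. This is an illustrative rewrite under that assumption, not the answerer's code; `keep`, `pieces`, `total` and `grand` are made-up names. Building each total row explicitly also keeps `Quadra` and `Talhao` as integers rather than the floats (`1.0`, `2.0`) seen in the output above.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']
keep = ['Quadra', 'Talhao'] + values + ['Variedade']

# For each Quadra: the detail rows, followed by a one-row subtotal frame.
pieces = []
for quadra, grp in df.groupby('Quadra'):
    total = (grp[values].sum().to_frame().T
             .assign(Quadra=quadra, Talhao='Total'))
    pieces.extend([grp[keep], total])

# Grand total over the whole frame.
grand = (df[values].sum().to_frame().T
         .assign(Quadra='Grand Total', Talhao=''))

new = (pd.concat(pieces + [grand], ignore_index=True)
       .set_index(['Quadra', 'Talhao']))
new['Variedade'] = new['Variedade'].fillna('')

print(new)
```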
