Pandas Pivot table subtotals per each cols

Can I achieve my Desired Output (given below), or something similar, with the following dataset using pivot_table in pandas? I am trying to do something like:

pivot_table(df, rows=['region'], cols=['area','distributor','salesrep'], 
            aggfunc=np.sum, margins=True).stack(['area','distributor','salesrep'])

but I am only getting subtotals per region. If I move area from cols to rows, then I only get subtotals per area.

Dataset:

region   area            distributor     salesrep       sales    invoice_count
Central  Butterworth     HIN MARKETING   TLS            500      25
Central  Butterworth     HIN MARKETING   TLS            500      25
Central  Butterworth     HIN MARKETING   OSE            500      25
Central  Butterworth     HIN MARKETING   OSE            500      25
Central  Butterworth     KWANG HENG      TCS            500      25
Central  Butterworth     KWANG HENG      TCS            500      25
Central  Butterworth     KWANG HENG      LBH            500      25
Central  Butterworth     KWANG HENG      LBH            500      25
Central  Ipoh            SGH EDERAN      CHAN           500      25
Central  Ipoh            SGH EDERAN      CHAN           500      25
Central  Ipoh            SGH EDERAN      KAMACHI        500      25
Central  Ipoh            SGH EDERAN      KAMACHI        500      25
Central  Ipoh            CORE SYN        LILIAN         500      25
Central  Ipoh            CORE SYN        LILIAN         500      25
Central  Ipoh            CORE SYN        TEOH           500      25
Central  Ipoh            CORE SYN        TEOH           500      25
East     JB              LEI WAH         NF05           500      25
East     JB              LEI WAH         NF05           500      25
East     JB              LEI WAH         NF06           500      25
East     JB              LEI WAH         NF06           500      25
East     JB              WONDER F&B      SEREN          500      25
East     JB              WONDER F&B      SEREN          500      25
East     JB              WONDER F&B      MONC           500      25
East     JB              WONDER F&B      MONC           500      25
East     PJ              PENGEDAR        NORM           500      25
East     PJ              PENGEDAR        NORM           500      25
East     PJ              PENGEDAR        SIMON          500      25
East     PJ              PENGEDAR        SIMON          500      25
East     PJ              HEBAT           OGI            500      25
East     PJ              HEBAT           OGI            500      25
East     PJ              HEBAT           MIGI           500      25
East     PJ              HEBAT           MIGI           500      25

Desired Output:

region       area          distributor       salesrep             invoice_count sales
Grand Total                                                                 800 16000
Central      Central Total                                                  400  8000
Central      Butterworth   Butterworth Total                                200  4000
Central      Butterworth   HIN MARKETING     HIN MARKETING Total            100  2000
Central      Butterworth   HIN MARKETING     OSE                             50  1000
Central      Butterworth   HIN MARKETING     TLS                             50  1000
Central      Butterworth   KWANG HENG        KWANG HENG Total               100  2000
Central      Butterworth   KWANG HENG        LBH                             50  1000
Central      Butterworth   KWANG HENG        TCS                             50  1000
Central      Ipoh          Ipoh Total                                       200  4000
Central      Ipoh          CORE SYN          CORE SYN Total                 100  2000
Central      Ipoh          CORE SYN          LILIAN                          50  1000
Central      Ipoh          CORE SYN          TEOH                            50  1000
Central      Ipoh          SGH EDERAN        SGH EDERAN Total               100  2000
Central      Ipoh          SGH EDERAN        CHAN                            50  1000
Central      Ipoh          SGH EDERAN        KAMACHI                         50  1000
East         East Total                                                     400  8000
East         JB            JB Total                                         200  4000
East         JB            LEI WAH           LEI WAH Total                  100  2000
East         JB            LEI WAH           NF05                            50  1000
East         JB            LEI WAH           NF06                            50  1000
East         JB            WONDER F&B        WONDER F&B Total               100  2000
East         JB            WONDER F&B        MONC                            50  1000
East         JB            WONDER F&B        SEREN                           50  1000
East         PJ            PJ Total                                         200  4000
East         PJ            HEBAT             HEBAT Total                    100  2000
East         PJ            HEBAT             MIGI                            50  1000
East         PJ            HEBAT             OGI                             50  1000
East         PJ            PENGEDAR          PENGEDAR Total                 100  2000
East         PJ            PENGEDAR          NORM                            50  1000
East         PJ            PENGEDAR          SIMON                           50  1000

We could use groupby instead of pivot_table:

import numpy as np
import pandas as pd


def label(ser):
    # Build a subtotal label such as 'Central Total'
    return '{s} Total'.format(s=ser)

filename = 'data.txt'
df = pd.read_table(filename, delimiter='\t')

# Aggregate only the numeric columns; summing the string columns as well
# would concatenate them on recent pandas versions
values = ['invoice_count', 'sales']

total = pd.DataFrame({'region': ['Grand Total'],
                      'invoice_count': df['invoice_count'].sum(),
                      'sales': df['sales'].sum()})
total['total_rank'] = 1

region_total = df.groupby(['region'], as_index=False)[values].sum()
region_total['area'] = region_total['region'].apply(label)
region_total['region_rank'] = 1

area_total = df.groupby(['region', 'area'], as_index=False)[values].sum()
area_total['distributor'] = area_total['area'].apply(label)
area_total['area_rank'] = 1

dist_total = df.groupby(
    ['region', 'area', 'distributor'], as_index=False)[values].sum()
dist_total['salesrep'] = dist_total['distributor'].apply(label)

rep_total = df.groupby(
    ['region', 'area', 'distributor', 'salesrep'], as_index=False)[values].sum()

# UNION the DataFrames into one DataFrame
result = pd.concat([total, region_total, area_total, dist_total, rep_total])

# Replace NaNs in the key columns with empty strings
result = result.fillna(
    {'region': '', 'area': '', 'distributor': '', 'salesrep': ''})

# Reorder the rows. np.lexsort treats the LAST key as the primary one, and
# the *_rank columns (1 for a total row, NaN otherwise) put each total first.
# Ties keep their concat order, so each distributor total precedes its reps.
sorter = np.lexsort((
    result['distributor'].rank(),
    result['area_rank'].rank(),
    result['area'].rank(),
    result['region_rank'].rank(),
    result['region'].rank(),
    result['total_rank'].rank()))
result = result.take(sorter)
result = result.reindex(
    columns=['region', 'area', 'distributor', 'salesrep', 'invoice_count', 'sales'])
print(result.to_string(index=False))

yields

      region           area        distributor             salesrep  invoice_count  sales
 Grand Total                                                                   800  16000
     Central  Central Total                                                    400   8000
     Central    Butterworth  Butterworth Total                                 200   4000
     Central    Butterworth      HIN MARKETING  HIN MARKETING Total            100   2000
     Central    Butterworth      HIN MARKETING                  OSE             50   1000
     Central    Butterworth      HIN MARKETING                  TLS             50   1000
     Central    Butterworth         KWANG HENG     KWANG HENG Total            100   2000
     Central    Butterworth         KWANG HENG                  LBH             50   1000
     Central    Butterworth         KWANG HENG                  TCS             50   1000
     Central           Ipoh         Ipoh Total                                 200   4000
     Central           Ipoh           CORE SYN       CORE SYN Total            100   2000
     Central           Ipoh           CORE SYN               LILIAN             50   1000
     Central           Ipoh           CORE SYN                 TEOH             50   1000
     Central           Ipoh         SGH EDERAN     SGH EDERAN Total            100   2000
     Central           Ipoh         SGH EDERAN                 CHAN             50   1000
     Central           Ipoh         SGH EDERAN              KAMACHI             50   1000
        East     East Total                                                    400   8000
        East             JB           JB Total                                 200   4000
        East             JB            LEI WAH        LEI WAH Total            100   2000
        East             JB            LEI WAH                 NF05             50   1000
        East             JB            LEI WAH                 NF06             50   1000
        East             JB         WONDER F&B     WONDER F&B Total            100   2000
        East             JB         WONDER F&B                 MONC             50   1000
        East             JB         WONDER F&B                SEREN             50   1000
        East             PJ           PJ Total                                 200   4000
        East             PJ              HEBAT          HEBAT Total            100   2000
        East             PJ              HEBAT                 MIGI             50   1000
        East             PJ              HEBAT                  OGI             50   1000
        East             PJ           PENGEDAR       PENGEDAR Total            100   2000
        East             PJ           PENGEDAR                 NORM             50   1000
        East             PJ           PENGEDAR                SIMON             50   1000
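
The same idea can be written as a loop over the grouping keys, so it is not tied to exactly four levels. The following is only a sketch: with_subtotals is a hypothetical helper name, the sample data is rebuilt inline instead of being read from data.txt (with the distributor spelled KWANG HENG throughout, as in the output above), and the row order comes from sorting with NaN first rather than from the rank columns.

```python
import pandas as pd

# Sample data from the question, rebuilt inline: two identical rows per
# sales rep, each with sales=500 and invoice_count=25.
reps = {
    ('Central', 'Butterworth', 'HIN MARKETING'): ['TLS', 'OSE'],
    ('Central', 'Butterworth', 'KWANG HENG'): ['TCS', 'LBH'],
    ('Central', 'Ipoh', 'SGH EDERAN'): ['CHAN', 'KAMACHI'],
    ('Central', 'Ipoh', 'CORE SYN'): ['LILIAN', 'TEOH'],
    ('East', 'JB', 'LEI WAH'): ['NF05', 'NF06'],
    ('East', 'JB', 'WONDER F&B'): ['SEREN', 'MONC'],
    ('East', 'PJ', 'PENGEDAR'): ['NORM', 'SIMON'],
    ('East', 'PJ', 'HEBAT'): ['OGI', 'MIGI'],
}
records = [key + (rep, 500, 25)
           for key, names in reps.items()
           for rep in names
           for _ in range(2)]
df = pd.DataFrame(records, columns=['region', 'area', 'distributor',
                                    'salesrep', 'sales', 'invoice_count'])


def with_subtotals(df, keys, values):
    """Hypothetical helper: detail rows plus one subtotal row per key prefix."""
    # The grand total frame has no key columns; concat leaves them as NaN.
    frames = [df[values].sum().to_frame().T]
    for depth in range(1, len(keys) + 1):
        frames.append(df.groupby(keys[:depth], as_index=False)[values].sum())
    out = pd.concat(frames, ignore_index=True)

    # Sorting with NaN first puts every subtotal row above the rows it covers.
    out = out.sort_values(keys, na_position='first')

    # A row with i real keys gets its label in the next key column.
    depth = out[keys].notna().sum(axis=1)
    for i, key in enumerate(keys):
        mask = depth == i
        if i == 0:
            out.loc[mask, key] = 'Grand Total'
        else:
            out.loc[mask, key] = out.loc[mask, keys[i - 1]] + ' Total'
    out[keys] = out[keys].fillna('')
    return out[keys + values].reset_index(drop=True)


keys = ['region', 'area', 'distributor', 'salesrep']
result = with_subtotals(df, keys, ['invoice_count', 'sales'])
print(result.to_string(index=False))
```

Because the subtotal rows are identified by their missing keys before any label is filled in, the labels themselves never take part in the sort.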

I do not know how to get subtotals inside the table, but if you run

# `index` is the current name of the old `rows` argument
df.pivot_table(index=['region','area','distributor','salesrep'],
  aggfunc=np.sum, margins=True)

you will get

                                            invoice_count  sales
region  area        distributor   salesrep                      
Central Butterworth HIN MARKETING OSE                  50   1000
                                  TLS                  50   1000
                    KWANG HENG    LBH                  50   1000
                                  TCS                  50   1000
        Ipoh        CORE SYN      LILIAN               50   1000
                                  TEOH                 50   1000
                    SGH EDERAN    CHAN                 50   1000
                                  KAMACHI              50   1000
East    JB          LEI WAH       NF05                 50   1000
                                  NF06                 50   1000
                    WONDER F&B    MONC                 50   1000
                                  SEREN                50   1000
        PJ          HEBAT         MIGI                 50   1000
                                  OGI                  50   1000
                    PENGEDAR      NORM                 50   1000
                                  SIMON                50   1000
All                                                   800  16000
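
The margin row is labelled All by default. As a small sketch, assuming a pandas version that has the margins_name argument of pivot_table, the label can be changed to match the Grand Total row in the question. The data is rebuilt inline here (with the KWANG HENG spelling normalized) and the value columns are listed explicitly; both are choices made for this example.

```python
import pandas as pd

# Sample data from the question, rebuilt inline: two identical rows per
# sales rep, each with sales=500 and invoice_count=25.
reps = {
    ('Central', 'Butterworth', 'HIN MARKETING'): ['TLS', 'OSE'],
    ('Central', 'Butterworth', 'KWANG HENG'): ['TCS', 'LBH'],
    ('Central', 'Ipoh', 'SGH EDERAN'): ['CHAN', 'KAMACHI'],
    ('Central', 'Ipoh', 'CORE SYN'): ['LILIAN', 'TEOH'],
    ('East', 'JB', 'LEI WAH'): ['NF05', 'NF06'],
    ('East', 'JB', 'WONDER F&B'): ['SEREN', 'MONC'],
    ('East', 'PJ', 'PENGEDAR'): ['NORM', 'SIMON'],
    ('East', 'PJ', 'HEBAT'): ['OGI', 'MIGI'],
}
records = [key + (rep, 500, 25)
           for key, names in reps.items()
           for rep in names
           for _ in range(2)]
df = pd.DataFrame(records, columns=['region', 'area', 'distributor',
                                    'salesrep', 'sales', 'invoice_count'])

keys = ['region', 'area', 'distributor', 'salesrep']
table = df.pivot_table(index=keys,
                       values=['invoice_count', 'sales'],
                       aggfunc='sum',
                       margins=True,
                       margins_name='Grand Total')  # replaces the default 'All'

# The margin row is appended as the last row of the table
print(table.tail(3))
```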

If you want totals based on, say, region and area, you may run

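# `index` is the current name of the old `rows` argument; `values` keeps the
# string columns out of the aggregation on recent pandas versions
df.pivot_table(index=['region', 'area'], values=['invoice_count', 'sales'],
  aggfunc=np.sum, margins=True)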

which results in

                     invoice_count  sales
region  area                             
Central Butterworth            200   4000
        Ipoh                   200   4000
East    JB                     200   4000
        PJ                     200   4000
All                            800  16000
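
Totals for several levels can also be derived from one detailed pivot table by grouping on its index levels, which avoids calling pivot_table once per level. This is a sketch, not part of the original answer; the data is rebuilt inline (with the KWANG HENG spelling normalized) and the variable names are made up for the example.

```python
import pandas as pd

# Sample data from the question, rebuilt inline: two identical rows per
# sales rep, each with sales=500 and invoice_count=25.
reps = {
    ('Central', 'Butterworth', 'HIN MARKETING'): ['TLS', 'OSE'],
    ('Central', 'Butterworth', 'KWANG HENG'): ['TCS', 'LBH'],
    ('Central', 'Ipoh', 'SGH EDERAN'): ['CHAN', 'KAMACHI'],
    ('Central', 'Ipoh', 'CORE SYN'): ['LILIAN', 'TEOH'],
    ('East', 'JB', 'LEI WAH'): ['NF05', 'NF06'],
    ('East', 'JB', 'WONDER F&B'): ['SEREN', 'MONC'],
    ('East', 'PJ', 'PENGEDAR'): ['NORM', 'SIMON'],
    ('East', 'PJ', 'HEBAT'): ['OGI', 'MIGI'],
}
records = [key + (rep, 500, 25)
           for key, names in reps.items()
           for rep in names
           for _ in range(2)]
df = pd.DataFrame(records, columns=['region', 'area', 'distributor',
                                    'salesrep', 'sales', 'invoice_count'])

# One detailed table, indexed by all four keys (no margins here)
table = df.pivot_table(index=['region', 'area', 'distributor', 'salesrep'],
                       values=['invoice_count', 'sales'],
                       aggfunc='sum')

# Subtotals are sums over the remaining, finer index levels
region_totals = table.groupby(level='region').sum()
area_totals = table.groupby(level=['region', 'area']).sum()
dist_totals = table.groupby(level=['region', 'area', 'distributor']).sum()

print(region_totals)
print(area_totals)
```

Each result is a separate table; merging them back into a single table with the totals in between still needs a step like the groupby and concat approach.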
