
Pandas subtotals and totals of pivot tables with a MultiIndex

I am trying to add a new column with subtotals and a final column with totals. For example,

import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
                   "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})

i.e.:

     A    B      C  D  E
0  foo  one  small  1  2
1  foo  one  large  2  4
2  foo  one  large  2  5
3  foo  two  small  3  5
4  foo  two  small  3  6
5  bar  one  large  4  6
6  bar  one  small  5  8
7  bar  two  small  6  9
8  bar  two  large  7  9

Now I pivot:

table = pd.pivot_table(df, values=['D',"E"], index=['A'],columns=['C'])

and add the totals:

table['total'] = table.sum(axis=1)
for t in ["D", "E"]:
   table[t, "partial_total"]  = table[t].sum(axis=1)

While this works numerically, the layout is annoying to read. I would like to have all the data for D (including the partial_total), then E, then the total. Here's my resulting df:

        D               E                total             D             E
C   large     small large     small            partial_total partial_total
A                                                                         
bar   5.5  5.500000   7.5  8.500000  27.000000     11.000000     16.000000
foo   2.0  2.333333   4.5  4.333333  13.166667      4.333333      8.833333

So, how do I group together the values for the same (top-level) columns?

Try this using pd.concat. Compute the per-group sums while the columns are still a MultiIndex, then flatten the names (sum(axis=1, level=0) was removed in pandas 2.0, and it has no top level to group on once the columns are flattened):

table = pd.pivot_table(df, values=['D', 'E'], index=['A'], columns=['C'])
# Sum within each top-level column (D, E) before the MultiIndex is flattened.
partial = table.T.groupby(level=0).sum().T.add_suffix('_partial_total')
total = table.sum(axis=1).to_frame(name='total')
table.columns = [f'{i}_{j}' for i, j in table.columns]
pd.concat([table, partial, total], axis=1)

Output:

     D_large   D_small  E_large   E_small  D_partial_total  E_partial_total      total
A
bar      5.5  5.500000      7.5  8.500000        11.000000        16.000000  27.000000
foo      2.0  2.333333      4.5  4.333333         4.333333         8.833333  13.166667
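If you would rather keep the two-level columns than flatten them, pd.concat with a dict of sub-frames can do the grouping for you. The following is only a sketch under that assumption; the names blocks and grouped are made up for illustration, and column B is left out because the pivot does not use it.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

table = pd.pivot_table(df, values=["D", "E"], index=["A"], columns=["C"])

# One sub-frame per top-level column, each with its own partial_total appended.
blocks = {
    top: table[top].assign(partial_total=table[top].sum(axis=1))
    for top in table.columns.get_level_values(0).unique()
}

# The dict keys become the top column level again, so D and E stay grouped.
grouped = pd.concat(blocks, axis=1)

# Grand total over all original cells, appended as the last column.
grouped[("total", "")] = table.sum(axis=1)

print(grouped)
```

Because each partial_total is appended to its own block before concatenation, the columns come out as D (large, small, partial_total), then E, then total, with no reordering step.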

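You can pivot with margins. Note that this version aggregates with aggfunc='sum' instead of the default mean used in the question, and that margins=True also adds a partial_total row: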

new_df = (df.pivot_table(index='A', columns='C', 
                         values=['D','E'], aggfunc='sum',
                         margins=True, margins_name='partial_total')
   .assign(total=lambda x: x.loc[:, (slice(None),'partial_total')].sum(1))
)

Output:

                D                               E                               total
C               large   small   partial_total   large   small   partial_total   
A                           
bar             11      11      22              15      17      32              54
foo             4       7       11              9       13      22              33
partial_total   15      18      33              24      30      54              87
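One thing worth checking before reusing the margins approach with the question's default mean aggregation: the margin is then the mean over all rows of the group, not the sum of the per-C means that the question calls partial_total. The sketch below only illustrates that difference for the bar row of D; the variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

# Same pivot as the answer above, but with the default aggfunc (mean).
mean_margins = df.pivot_table(index="A", columns="C", values=["D", "E"],
                              margins=True, margins_name="partial_total")

# Margin under mean: the mean of all four 'bar' values of D.
margin_bar = mean_margins.loc["bar", ("D", "partial_total")]

# What the question computes: the sum of the two per-C means for 'bar'.
sum_of_means_bar = mean_margins["D"][["large", "small"]].loc["bar"].sum()

print(margin_bar, sum_of_means_bar)
```

So margins gives the subtotals from the question only when the aggregation itself is a sum.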

You can also perform the operations before the pivot_table:

g = df.groupby(['A', 'C'])[['D', 'E']]

d = (g.sum()/g.count()).reset_index()
m = d.groupby('A', as_index=False).sum().assign(C='partial')

final = pd.concat([m, d]).pivot_table(index='A', columns='C')

        D                          E                     
C   large     small    partial large     small    partial
A                                                        
bar   5.5  5.500000  11.000000   7.5  8.500000  16.000000
foo   2.0  2.333333   4.333333   4.5  4.333333   8.833333
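The result above has the subtotals but not the final total column the question asks for. One possible way to add it, assuming the subtotal label stays 'partial', is to pull that label out of every top-level group with xs and sum across. This is a sketch, not part of the original answer; g.mean() is used as a shorthand for g.sum()/g.count().

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

# Per-(A, C) means, then per-A sums of those means labelled as 'partial'.
d = df.groupby(["A", "C"])[["D", "E"]].mean().reset_index()
m = d.groupby("A", as_index=False)[["D", "E"]].sum().assign(C="partial")

final = pd.concat([m, d]).pivot_table(index="A", columns="C")

# Grand total: add up the 'partial' column of every top-level group.
final[("total", "")] = final.xs("partial", axis=1, level="C").sum(axis=1)

print(final)
```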

To answer your last question specifically:

how do I group together the values for the same (top-level) columns?

You can simply use sort_index. It sorts the column labels lexicographically, so partial_total ends up between large and small within each group:

table.sort_index(axis=1)

        D                             E                              total
C   large partial_total     small large partial_total     small           
A                                                                         
bar   5.5     11.000000  5.500000   7.5     16.000000  8.500000  27.000000
foo   2.0      4.333333  2.333333   4.5      8.833333  4.333333  13.166667
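If the alphabetical position of partial_total bothers you, you can spell out the column order yourself and select with a list of tuples. This is a minimal sketch, assuming the inner labels are exactly large and small as in the example data; inner, ordered_cols and ordered are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

# Rebuild the table from the question, totals included.
table = pd.pivot_table(df, values=["D", "E"], index=["A"], columns=["C"])
table["total"] = table.sum(axis=1)
for t in ["D", "E"]:
    table[t, "partial_total"] = table[t].sum(axis=1)

# Desired order inside each group: the original C values first, subtotal last.
inner = ["large", "small", "partial_total"]
ordered_cols = [(t, c) for t in ["D", "E"] for c in inner] + [("total", "")]

# Selecting with a list of tuples returns the columns in exactly that order.
ordered = table[ordered_cols]

print(ordered)
```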
