
How to add sub-total columns to a multilevel columns dataframe?

I have a dataframe with 3 levels of MultiIndex columns:

quarter           Q1                        Q2                        Totals
year              2021        2022           2021         2022                      
                 qty orders  qty orders    qty orders   qty orders   qty orders
month name                                       
January          40  2        5   1         1   2         0 0             46  5
February         20  8        2   3         4   6         0 0             26  17
March            2  10        7   4         3   3         0 0             12  17
Totals           62 20       14   8         8   11        0 0             84  39

After grouping by levels (0, 2), I get the following subtotals dataframe:

quarter           Q1           Q2          Totals                     
                 qty orders  qty orders    qty orders  
month name                                       
January          45  3        1   2         46   5     
February         22  11       4   6         26   17
March            9  14        3   3         12   17   
Totals           76 28        8   11        84   39

I need to insert the second one into the first, without disturbing the columns, levels, or index, so that I get the following dataframe:

quarter       Q1                                   Q2                        Totals
year        2021        2022      Subtotal    2021        2022     Subtotal                 
            qty orders qty orders qty orders qty orders qty orders qty orders qty orders
month name                                       
January     40  2       5   1     45   3       1  2       0  0       1  2     46  5
February    20  8       2   3     22   11      4  6       0  0       4  6     26  17
March       2  10       7   4     9    14      3  3       0  0       3  3     12  17
Totals      62 20      14   8     76   28      8  11      0  0       8  11    84 39

How can I do this?

Using your initial dataframe (before the groupby):

import pandas as pd


df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

Here is one way to do it (using product from the itertools module of Python's standard library; nested for loops would work too):

from itertools import product

# Add new columns
for level1, level2 in product(["Q1", "Q2"], ["qty", "orders"]):
    df.loc[:, (level1, "subtotal", level2)] = (
        df.loc[:, (level1, "2021", level2)] + df.loc[:, (level1, "2022", level2)]
    )

# Sort columns
df = df.reindex(
    pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022", "subtotal"), ("qty", "orders")]
    ),
    axis=1,
)
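
As an alternative sketch, the subtotals can be computed for every quarter at once by grouping on levels 0 and 2 (as in the question) and concatenating the result onto the original frame. This assumes a recent pandas version, where grouping over columns is done on the transposed frame; the names subtotals, wanted and result are only illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

# Sum over the year level: group the transposed frame by levels 0 and 2,
# then transpose back so quarters/metrics are columns again.
subtotals = df.T.groupby(level=[0, 2]).sum().T

# Insert a middle "subtotal" label so the columns have 3 levels like df.
subtotals.columns = pd.MultiIndex.from_tuples(
    [(quarter, "subtotal", metric) for quarter, metric in subtotals.columns]
)

# Join both frames and put the columns in the desired order.
wanted = pd.MultiIndex.from_product(
    [("Q1", "Q2"), ("2021", "2022", "subtotal"), ("qty", "orders")]
)
result = pd.concat([df, subtotals], axis=1).reindex(columns=wanted)
```

Here result should hold the same values as the loop-based version; the trade-off is that no quarter or metric names need to be hard-coded when computing the sums, only when fixing the column order.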

Then:

print(df)
# Output
           Q1                                      Q2                     \
         2021        2022        subtotal        2021        2022
          qty orders  qty orders      qty orders  qty orders  qty orders   
January    40      2    5      1       45      3    1      2    0      0   
February   20      8    2      3       22     11    4      6    0      0   
March       2     10    7      4        9     14    3      3    0      0   
Totals     62     20   14      8       76     28    8     11    0      0   


         subtotal
              qty orders  
January         1      2  
February        4      6  
March           3      3  
Totals          8     11  
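
The target frame in the question also ends with a Totals block, which the frame built above does not contain. A minimal sketch of one way to add it, assuming it is computed from the initial dataframe before the subtotal columns exist (otherwise the values would be counted twice), and assuming an empty string as the label for the year level:

```python
import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

# Grand totals per metric, summed over every quarter and year
# (group the transposed frame by the innermost column level).
totals = df.T.groupby(level=2).sum().T

# Append them under a top-level "Totals" key; the empty year label
# is an assumption made to mirror the layout shown in the question.
for metric in ("qty", "orders"):
    df[("Totals", "", metric)] = totals[metric]
```

The subtotal columns can then be added as shown earlier, with the two Totals tuples appended to the MultiIndex passed to reindex so they are kept at the end.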

