How to add sub-total columns to a multilevel columns dataframe?
I have a dataframe whose columns are a MultiIndex with 3 levels:
quarter     Q1                      Q2                      Totals
year        2021        2022        2021        2022
            qty orders  qty orders  qty orders  qty orders  qty orders
month name
January      40      2    5      1    1      2    0      0   46      5
February     20      8    2      3    4      6    0      0   26     17
March         2     10    7      4    3      3    0      0   12     17
Totals       62     20   14      8    8     11    0      0   84     39
After grouping by levels (0, 2), I get the following subtotal dataframe:
quarter     Q1          Q2          Totals
            qty orders  qty orders  qty orders
month name
January      45      3    1      2   46      5
February     22     11    4      6   26     17
March         9     14    3      3   12     17
Totals       76     28    8     11   84     39
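For reference, a grouping like this can be sketched as follows. This is an assumption about how the subtotals were produced, since the question does not show that code: the frame below leaves out the precomputed Totals columns, and the names `df` and `subtotals` are illustrative. The frame is transposed before grouping because `DataFrame.groupby(axis=1)` is deprecated in recent pandas versions.

```python
import pandas as pd

# Illustrative frame: quarter / year / measure on the columns, months on the rows.
df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

# Group the columns by level 0 (quarter) and level 2 (measure), summing over
# the year level. Transposing turns the column levels into row levels, so a
# regular row-wise groupby can be used, and the result is transposed back.
# sort=False asks pandas to keep groups in order of first appearance.
subtotals = df.T.groupby(level=[0, 2], sort=False).sum().T

print(subtotals)
```

Each cell of `subtotals` is the sum of the 2021 and 2022 values for that quarter and measure.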
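I need to insert the second dataframe into the first one without disturbing the columns, levels, or index, so that I end up with the following dataframe: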
quarter     Q1                                  Q2                                  Totals
year        2021        2022        Subtotal    2021        2022        Subtotal
            qty orders  qty orders  qty orders  qty orders  qty orders  qty orders  qty orders
month name
January      40      2    5      1   45      3    1      2    0      0    1      2   46      5
February     20      8    2      3   22     11    4      6    0      0    4      6   26     17
March         2     10    7      4    9     14    3      3    0      0    3      3   12     17
Totals       62     20   14      8   76     28    8     11    0      0    8     11   84     39
How can I do this?
Starting from your initial dataframe (before the groupby):
import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)
Here is one way to do it (using product from the itertools module of Python's standard library; nested for loops would work just as well):
from itertools import product

# Add a subtotal column for each (quarter, measure) pair
for level1, level2 in product(["Q1", "Q2"], ["qty", "orders"]):
    df.loc[:, (level1, "subtotal", level2)] = (
        df.loc[:, (level1, "2021", level2)] + df.loc[:, (level1, "2022", level2)]
    )

# Reorder the columns so each subtotal follows the years of its quarter
df = df.reindex(
    pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022", "subtotal"), ("qty", "orders")]
    ),
    axis=1,
)
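The nested-loop alternative mentioned above could look like the sketch below. It rebuilds the same `df` so that it runs on its own and assumes, like the code in this answer, that the years are exactly "2021" and "2022"; the loop variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

# Same logic as itertools.product, written as two explicit loops.
for quarter in ("Q1", "Q2"):
    for measure in ("qty", "orders"):
        # Assumption: the only years present are "2021" and "2022".
        df[(quarter, "subtotal", measure)] = (
            df[(quarter, "2021", measure)] + df[(quarter, "2022", measure)]
        )

# New columns are appended at the end, so reorder them afterwards.
df = df.reindex(
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022", "subtotal"), ("qty", "orders")]
    )
)
```

Either form leaves `df` in the same final state, so the choice is mostly a matter of taste.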
Then:
print(df)
# Output
            Q1                                  Q2                      \
            2021        2022        subtotal    2021        2022
            qty orders  qty orders  qty orders  qty orders  qty orders
January      40      2    5      1   45      3    1      2    0      0
February     20      8    2      3   22     11    4      6    0      0
March         2     10    7      4    9     14    3      3    0      0
Totals       62     20   14      8   76     28    8     11    0      0

            subtotal
            qty orders
January       1      2
February      4      6
March         3      3
Totals        8     11
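If the years should not be hard-coded, or if the grand Totals columns and the level names from the question are wanted as well, a more general variant might look like the sketch below. It is only one possible approach, not part of the answer above: the names `sub`, `tot`, `order`, and `out`, the "Subtotal" label, and the empty string used as the year of the Totals columns are all assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")],
        names=["quarter", "year", None],
    ),
    index=pd.Index(["January", "February", "March", "Totals"], name="month name"),
)
names = df.columns.names

# Per-quarter subtotals: sum over the year level, whatever years are present.
sub = df.T.groupby(level=[0, 2], sort=False).sum().T
sub.columns = pd.MultiIndex.from_tuples(
    [(quarter, "Subtotal", measure) for quarter, measure in sub.columns],
    names=names,
)

# Grand totals: sum over both the quarter and the year levels.
tot = df.T.groupby(level=2, sort=False).sum().T
tot.columns = pd.MultiIndex.from_tuples(
    [("Totals", "", measure) for measure in tot.columns], names=names
)

# Build the target column order from the labels in order of appearance.
quarters = df.columns.get_level_values(0).unique()
years = df.columns.get_level_values(1).unique()
measures = df.columns.get_level_values(2).unique()
order = [
    (quarter, year, measure)
    for quarter in quarters
    for year in [*years, "Subtotal"]
    for measure in measures
]
order += [("Totals", "", measure) for measure in measures]

# Glue the pieces together side by side, then put the columns in order.
out = pd.concat([df, sub, tot], axis=1).reindex(
    columns=pd.MultiIndex.from_tuples(order, names=names)
)
print(out)
```

Because the subtotal and total labels are assigned from the grouped result's own columns, the final `reindex` is what determines the column order, independently of how the groupby orders its groups.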