
Group a pandas DataFrame with MultiIndex columns for axis=0

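I have a pandas DataFrame with a MultiIndex on its columns and want to group it by the values of all columns under the top-level label baz. For a DataFrame without a MultiIndex this is easy (df.groupby(your_cols) and done), but I can't find an intuitive solution for MultiIndex columns.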

import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
idx = pd.MultiIndex.from_product(iterables, names=["first", "second"])
data = [
    [0, 0, 1, 1, 2, 3, 0, 0],
    [0, 0, 1, 1, 3, 2, 0, 0],
    [0, 1, 2, 3, 1, 2, 0, 0],
    [1, 0, 2, 3, 0, 3, 0, 0],
]
df_with_multi = pd.DataFrame(data=data, columns=idx)
df_without_multi = df_with_multi.copy()
df_without_multi.columns = df_without_multi.columns.map("|".join).str.strip("|")

baz_cols = [c for c in df_without_multi if c.startswith("baz|")]
easy_peasy_grouping = df_without_multi.groupby(baz_cols).first()
print(easy_peasy_grouping)

# What I would have expected but it gives an error
not_so_easy_grouping = df_with_multi.groupby("baz").first()
# ValueError: Grouper for 'baz' not 1-dimensional

Input:

first  bar     baz     foo     qux    
second one two one two one two one two
0        0   0   1   1   2   3   0   0
1        0   0   1   1   3   2   0   0
2        0   1   2   3   1   2   0   0
3        1   0   2   3   0   3   0   0

Expected output:

first   bar     foo     qux
second  one two one two one two
baz
one two
1   1     0   0   2   3   0   0
2   3     0   1   1   2   0   0

The following seems to work:

cols = [("baz", c) for c in df_with_multi["baz"].columns]
not_so_easy_grouping = df_with_multi.groupby(cols).first()

Output:

first                 bar     foo     qux
second                one two one two one two
(baz, one) (baz, two)
1          1            0   0   2   3   0   0
2          3            0   1   1   2   0   0
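
As a possible variation, here is a minimal sketch that passes the baz sub-columns as Series keys instead of column tuples. It first checks that selecting "baz" yields a two-dimensional DataFrame, which is presumably why groupby("baz") rejects it as a single grouper. The names df, baz, keys and grouped are illustrative and not from the original post, and dropping the baz columns before grouping is a choice made here to keep them out of the result.

```python
import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
idx = pd.MultiIndex.from_product(iterables, names=["first", "second"])
data = [
    [0, 0, 1, 1, 2, 3, 0, 0],
    [0, 0, 1, 1, 3, 2, 0, 0],
    [0, 1, 2, 3, 1, 2, 0, 0],
    [1, 0, 2, 3, 0, 3, 0, 0],
]
df = pd.DataFrame(data=data, columns=idx)

# Selecting a top-level label returns a 2-D DataFrame (one column per
# second-level label), not a 1-D Series.
baz = df["baz"]
print(baz.ndim)

# Use each baz sub-column as its own grouping key. The Series names
# ("one", "two") become the index level names of the result.
keys = [baz[c] for c in baz.columns]

# Drop the baz columns first so they do not appear among the aggregated
# columns, then take the first row of each group.
grouped = df.drop(columns="baz", level="first").groupby(keys).first()
print(grouped)
```

Compared with the tuple-based version, the result index levels are named one and two rather than (baz, one) and (baz, two), which may or may not be preferable depending on what happens to the result afterwards.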

