Group a pandas DataFrame with MultiIndex columns for axis=0
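I have a pandas DataFrame with a MultiIndex on the columns and would like to group it by the values of all columns under the level baz. This is straightforward for a DataFrame without a MultiIndex (df.groupby(your_cols), done), but I can't find an intuitive solution for MultiIndex columns.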
import pandas as pd
iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
idx = pd.MultiIndex.from_product(iterables, names=["first", "second"])
data = [
[0, 0, 1, 1, 2, 3, 0, 0],
[0, 0, 1, 1, 3, 2, 0, 0],
[0, 1, 2, 3, 1, 2, 0, 0],
[1, 0, 2, 3, 0, 3, 0, 0],
]
df_with_multi = pd.DataFrame(data=data, columns=idx)
df_without_multi = df_with_multi.copy()
df_without_multi.columns = df_without_multi.columns.map("|".join).str.strip("|")
baz_cols = [c for c in df_without_multi if c.startswith("baz|")]
easy_peasy_grouping = df_without_multi.groupby(baz_cols).first()
print(easy_peasy_grouping)
# What I would have expected but it gives an error
not_so_easy_grouping = df_with_multi.groupby("baz").first()
# ValueError: Grouper for 'baz' not 1-dimensional
Input:
first  bar     baz     foo     qux
second one two one two one two one two
0        0   0   1   1   2   3   0   0
1        0   0   1   1   3   2   0   0
2        0   1   2   3   1   2   0   0
3        1   0   2   3   0   3   0   0
Expected output:
first   bar     foo     qux
second  one two one two one two
baz
one two
1   1     0   0   2   3   0   0
2   3     0   1   1   2   0   0
The following seems to work:
cols = [("baz", c) for c in df_with_multi["baz"].columns]
not_so_easy_grouping = df_with_multi.groupby(cols).first()
Output:
first                  bar     foo     qux
second                 one two one two one two
(baz, one) (baz, two)
1          1             0   0   2   3   0   0
2          3             0   1   1   2   0   0
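If the tuple-shaped index names are unwanted, one possible variation is to pass the baz columns to groupby as Series rather than as column labels. This is only a sketch, not necessarily the canonical approach. The names keys and named_grouping are illustrative. It assumes that dropping the baz columns by hand is acceptable, since Series keys are not removed from the result automatically.

```python
import pandas as pd

# Same sample frame as in the question.
iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
idx = pd.MultiIndex.from_product(iterables, names=["first", "second"])
data = [
    [0, 0, 1, 1, 2, 3, 0, 0],
    [0, 0, 1, 1, 3, 2, 0, 0],
    [0, 1, 2, 3, 1, 2, 0, 0],
    [1, 0, 2, 3, 0, 3, 0, 0],
]
df_with_multi = pd.DataFrame(data=data, columns=idx)

# One key Series per column under "baz", renamed to its second-level label
# so that the resulting index levels are called "one" and "two".
keys = [df_with_multi[("baz", c)].rename(c) for c in df_with_multi["baz"].columns]

# Grouping by Series leaves the "baz" columns in the frame, so drop them first.
named_grouping = df_with_multi.drop(columns="baz", level=0).groupby(keys).first()
print(named_grouping)
```

The grouped values should match the output above. Only the index level names differ (one and two instead of the (baz, one) and (baz, two) tuples).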