Group a pandas DataFrame with MultiIndex columns for axis=0
I have a pandas DataFrame with a MultiIndex on the columns and want to group its rows by the values of all columns under the top-level label baz. This is straightforward for a DataFrame without a MultiIndex (df.groupby(your_cols), done), but I can't find an intuitive solution for MultiIndex columns.
import pandas as pd
iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
idx = pd.MultiIndex.from_product(iterables, names=["first", "second"])
data = [
[0, 0, 1, 1, 2, 3, 0, 0],
[0, 0, 1, 1, 3, 2, 0, 0],
[0, 1, 2, 3, 1, 2, 0, 0],
[1, 0, 2, 3, 0, 3, 0, 0],
]
df_with_multi = pd.DataFrame(data=data, columns=idx)
df_without_multi = df_with_multi.copy()
df_without_multi.columns = df_without_multi.columns.map("|".join).str.strip("|")
baz_cols = [c for c in df_without_multi if c.startswith("baz|")]
easy_peasy_grouping = df_without_multi.groupby(baz_cols).first()
print(easy_peasy_grouping)
# What I would have expected to work, but it raises an error
not_so_easy_grouping = df_with_multi.groupby("baz").first()
# ValueError: Grouper for 'baz' not 1-dimensional
Input:
first  bar     baz     foo     qux
second one two one two one two one two
0        0   0   1   1   2   3   0   0
1        0   0   1   1   3   2   0   0
2        0   1   2   3   1   2   0   0
3        1   0   2   3   0   3   0   0
Expected Output:
first   bar     foo     qux
second  one two one two one two
baz
one two
1   1     0   0   2   3   0   0
2   3     0   1   1   2   0   0
The following seems to work:
cols = [("baz", c) for c in df_with_multi["baz"].columns]
not_so_easy_grouping = df_with_multi.groupby(cols).first()
Output:
first                 bar     foo     qux
second                one two one two one two
(baz, one) (baz, two)
1          1            0   0   2   3   0   0
2          3            0   1   1   2   0   0
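The index level names in this output are the full column tuples, whereas the expected output uses the plain second-level labels. A minimal sketch of one way to close that gap, assuming it is acceptable to rename the index levels after grouping: the variable names df, baz_cols and grouped are illustrative, and selecting the columns with get_level_values is just one option, not necessarily the most idiomatic one.

```python
import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
idx = pd.MultiIndex.from_product(iterables, names=["first", "second"])
data = [
    [0, 0, 1, 1, 2, 3, 0, 0],
    [0, 0, 1, 1, 3, 2, 0, 0],
    [0, 1, 2, 3, 1, 2, 0, 0],
    [1, 0, 2, 3, 0, 3, 0, 0],
]
df = pd.DataFrame(data=data, columns=idx)

# Collect the full column tuples whose top level ("first") is "baz".
is_baz = df.columns.get_level_values("first") == "baz"
baz_cols = df.columns[is_baz].tolist()  # [("baz", "one"), ("baz", "two")]

# Group rows by those columns; each tuple is treated as one column label.
grouped = df.groupby(baz_cols).first()

# Rename the index levels from the full tuples to the second-level labels.
grouped.index = grouped.index.set_names([second for _, second in baz_cols])

print(grouped)
```

The grouping itself is the same as in the snippet above; only the column selection and the final renaming step differ.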