Python Pandas: MultiIndex groupby second level of columns
I'm trying to group rows by multiple columns. What I want to achieve can be illustrated by this small example:
import pandas as pd
col_index = pd.MultiIndex.from_arrays([['A','A','B','B'],['a','b','c','d']])
df = pd.DataFrame([ [1,2,3,3],
[4,2,2,2],
[6,4,2,2],
[1,2,4,4],
[3,8,4,4],
[1,2,3,3]], columns = col_index)
The DataFrame this creates looks like this:
   A     B
   a  b  c  d
0  1  2  3  3
1  4  2  2  2
2  6  4  2  2
3  1  2  4  4
4  3  8  4  4
5  1  2  3  3
I would like to group by 'c' and 'd' (in effect, the whole of 'B'), but my attempt gives me "KeyError: 'c'":
#something like this
df.groupby(['c','d'], axis = 1, level = 1)
#or like this
df.groupby('B', axis = 1, level = 0)
I tried searching for an answer but can't seem to find one.
Can somebody tell me what I'm doing wrong?
This is one way of doing it, by resetting the columns first:
# the inplace argument of set_axis was removed in pandas 2.0; a new frame is returned by default
df.set_axis(df.columns.droplevel(0), axis=1).groupby(['c','d']).sum()
Out[531]:
      a   b
c d
2 2  10   6
3 3   2   4
4 4   4  10
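As a self-contained variant of the same idea, the sketch below flattens the columns with `DataFrame.droplevel` instead of `set_axis`. The variable names `flat` and `result` are only illustrative, and this assumes a pandas version that has `DataFrame.droplevel` (0.24 or later).

```python
import pandas as pd

col_index = pd.MultiIndex.from_arrays([['A', 'A', 'B', 'B'],
                                       ['a', 'b', 'c', 'd']])
df = pd.DataFrame([[1, 2, 3, 3],
                   [4, 2, 2, 2],
                   [6, 4, 2, 2],
                   [1, 2, 4, 4],
                   [3, 8, 4, 4],
                   [1, 2, 3, 3]], columns=col_index)

# Drop the top column level ('A'/'B') so the columns are plain 'a', 'b', 'c', 'd'.
flat = df.droplevel(0, axis=1)

# With flat column labels, 'c' and 'd' are valid groupby keys.
result = flat.groupby(['c', 'd']).sum()
print(result)
```

This only works cleanly when the second-level labels are unique across the top-level groups, as they are here.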
You can also specify the two-level MultiIndex keys explicitly:
df.groupby([("B","c"), ("B", "d")])
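Since the question asks about grouping by the whole of 'B', the tuple keys can also be built from the columns rather than typed out. The following is a minimal sketch of that; the names `keys` and `result` are hypothetical, and it assumes every column under 'B' should be used as a grouping key.

```python
import pandas as pd

col_index = pd.MultiIndex.from_arrays([['A', 'A', 'B', 'B'],
                                       ['a', 'b', 'c', 'd']])
df = pd.DataFrame([[1, 2, 3, 3],
                   [4, 2, 2, 2],
                   [6, 4, 2, 2],
                   [1, 2, 4, 4],
                   [3, 8, 4, 4],
                   [1, 2, 3, 3]], columns=col_index)

# Build the full (level 0, level 1) tuples for every column under 'B'.
keys = [('B', sub) for sub in df['B'].columns]

# Each tuple names exactly one column, so groupby can resolve it.
result = df.groupby(keys).sum()
print(result)
```

Unlike the flattening approach, the remaining columns keep both levels, so they are addressed as `('A', 'a')` and `('A', 'b')` in the result.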