Python Pandas: MultiIndex groupby second level of columns

I'm trying to group rows by multiple columns. What I want to achieve can be illustrated by this small example:

import pandas as pd

col_index = pd.MultiIndex.from_arrays([['A','A','B','B'],['a','b','c','d']])

df = pd.DataFrame([ [1,2,3,3],
                    [4,2,2,2],
                    [6,4,2,2],
                    [1,2,4,4],
                    [3,8,4,4],
                    [1,2,3,3]], columns = col_index)

The resulting DataFrame looks like this:

   A     B   
   a  b  c  d
0  1  2  3  3
1  4  2  2  2
2  6  4  2  2
3  1  2  4  4
4  3  8  4  4
5  1  2  3  3

I would like to group by 'c' and 'd' (in effect, by the whole of 'B'), but this gives me "KeyError: 'c'":

#something like this
df.groupby(['c','d'], axis = 1, level = 1)
#or like this
df.groupby('B', axis = 1, level = 0)

I tried searching for an answer but can't seem to find one.

Can somebody tell me what I'm doing wrong?

One way to do it is to reset the columns first, dropping the top level:

df.set_axis(df.columns.droplevel(0), axis=1).groupby(['c','d']).sum()
Out[531]: 
      a   b
c d        
2 2  10   6
3 3   2   4
4 4   4  10
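
The same idea can probably be written a little more compactly with DataFrame.droplevel, which removes a level from the column MultiIndex directly. This is only a sketch, assuming a pandas version that provides droplevel (0.24 or later); the name `result` is introduced here for illustration.

```python
import pandas as pd

col_index = pd.MultiIndex.from_arrays([['A', 'A', 'B', 'B'], ['a', 'b', 'c', 'd']])
df = pd.DataFrame([[1, 2, 3, 3],
                   [4, 2, 2, 2],
                   [6, 4, 2, 2],
                   [1, 2, 4, 4],
                   [3, 8, 4, 4],
                   [1, 2, 3, 3]], columns=col_index)

# Drop the top column level ('A'/'B') so the columns are just a, b, c, d,
# then group the rows by the now single-level columns 'c' and 'd'.
result = df.droplevel(0, axis=1).groupby(['c', 'd']).sum()
print(result)
```

Note that this only works cleanly when the second-level labels are unique across the top-level groups, as they are in this example.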

You can also specify the two-level MultiIndex column keys explicitly:

df.groupby([("B","c"), ("B", "d")])
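
For completeness, here is a self-contained sketch of the tuple-key approach applied to the DataFrame from the question, with an aggregation added. The name `result` is an illustrative choice, and `sum()` is used only to mirror the other answer.

```python
import pandas as pd

col_index = pd.MultiIndex.from_arrays([['A', 'A', 'B', 'B'], ['a', 'b', 'c', 'd']])
df = pd.DataFrame([[1, 2, 3, 3],
                   [4, 2, 2, 2],
                   [6, 4, 2, 2],
                   [1, 2, 4, 4],
                   [3, 8, 4, 4],
                   [1, 2, 3, 3]], columns=col_index)

# Each grouping key is the full (level 0, level 1) tuple of a column,
# so the original column MultiIndex is left untouched.
result = df.groupby([('B', 'c'), ('B', 'd')]).sum()

# The remaining columns keep their two-level labels.
print(result[('A', 'a')].tolist())  # [10, 2, 4]
print(result[('A', 'b')].tolist())  # [6, 4, 10]
```

Unlike dropping a level, this keeps the 'A' label on the aggregated columns, which may matter if second-level names repeat under different top-level groups.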
