
Pandas plot subplots of a 'group by' result

I am struggling with my (poor) Pandas knowledge as I try to get a bar plot on a hierarchical index produced by a group-by operation.

My data look like this:

id, val, cat1, cat2

Then I create a hierarchical index:

df_group = df_len.groupby(['cat1','cat2'])

I would like to get one hbar plot per cat1 value, each listing the cat2 values found within that cat1 group together with their values.

None of my approaches worked:

  • df_group.plot(...)
  • for name, group in df_group: .... group.plot(...)
  • df_group.xs(...) experiments
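
For context on why these attempts tend to stall, here is a minimal sketch. It assumes a small made-up frame with the four columns named above (the values are hypothetical). As far as pandas behaves here, groupby on its own returns a lazy GroupBy object; the hierarchical (cat1, cat2) index only appears once an aggregation such as count() is applied.

```python
import pandas as pd

# Hypothetical sample data with the columns named in the question.
df_len = pd.DataFrame({
    "id":   [1, 2, 3, 4, 5, 6],
    "val":  [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "cat1": ["A", "A", "A", "B", "B", "B"],
    "cat2": ["x", "x", "y", "x", "y", "y"],
})

# groupby alone returns a GroupBy object, not a DataFrame with a new index.
df_group = df_len.groupby(["cat1", "cat2"])
print(type(df_group).__name__)

# Aggregating is what produces the hierarchical (cat1, cat2) index.
counts = df_group["val"].count()
print(counts.index.names)
print(counts)
```

The aggregated Series (counts here, a name chosen for illustration) is the object that can then be reshaped and plotted.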

The result should look somewhat like this one: [image]

I guess I just lack knowledge of the pandas, matplotlib, ... internals, and it's not that difficult to plot a few hundred items (cat2 < 10, cat1 = 30).


I'd recommend using seaborn to do this type of faceted plot. Doing it in matplotlib is very tricky as the library is quite low level. Seaborn excels at this use case.
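
A minimal sketch of what that could look like, assuming seaborn is installed and the data is already reduced to one value per (cat1, cat2) pair. The frame below is made up, and catplot is one possible choice among seaborn's faceting functions, not necessarily the one this answer had in mind.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import pandas as pd
import seaborn as sns

# Hypothetical, already-aggregated data: one value per (cat1, cat2) pair.
df = pd.DataFrame({
    "cat1": ["A", "A", "A", "B", "B", "B"],
    "cat2": ["x", "y", "z", "x", "y", "z"],
    "val":  [0.1, 0.9, 0.3, 0.5, 0.4, 0.7],
})

# One facet (subplot) per cat1 value. The bars are horizontal because the
# categorical variable cat2 is placed on the y axis.
g = sns.catplot(data=df, x="val", y="cat2", col="cat1",
                kind="bar", col_wrap=2, height=3)
```

With about 30 cat1 values, a larger col_wrap (for example 5 or 6) would keep the grid compact; the returned grid object can be written to disk with g.savefig(...).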

OK guys, here is how I finally solved it:

dfc = df_len.groupby(['cat1','cat2']).count().reset_index()
dfp=dfc.pivot(index="cat1",columns="cat2")
dfp.columns = dfp.columns.get_level_values(1)
dfp.plot(kind='bar', figsize=(15, 5), stacked=True);

In short: I used a pivot table to transpose my matrix, and then I was able to plot the single columns automatically, as in example 2 here.
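
For readers who want separate panels per cat1 instead of one stacked chart, the same pivot can be transposed and plotted with subplots=True. This is a sketch on made-up data (the df_len values below are hypothetical); counting a single column keeps the pivoted column labels flat, so the get_level_values step is not needed in this variant.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import pandas as pd

# Hypothetical sample data with the columns named in the question.
df_len = pd.DataFrame({
    "id":   [1, 2, 3, 4, 5, 6],
    "val":  [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "cat1": ["A", "A", "A", "B", "B", "B"],
    "cat2": ["x", "x", "y", "x", "y", "y"],
})

# Count rows per (cat1, cat2) pair, then pivot cat2 into columns.
dfc = df_len.groupby(["cat1", "cat2"])["val"].count().reset_index()
dfp = dfc.pivot(index="cat1", columns="cat2", values="val").fillna(0)

# Transpose so each cat1 becomes a column, then draw one horizontal
# bar subplot per column (that is, per cat1 value).
axes = dfp.T.plot(kind="barh", subplots=True, sharex=True, figsize=(6, 4))
```

Here axes is an array with one matplotlib Axes per cat1 value, so individual panels can still be adjusted afterwards.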

Not so tricky in matplotlib, see:

In [54]:

print(df)
  cat1  cat2       val
0    A     1  0.011887
1    A     2  0.880121
2    A     3  0.034244
3    A     4  0.530230
4    B     1  0.510812
5    B     2  0.405322
6    B     3  0.406259
7    B     4  0.406405
In [55]:

import matplotlib.pyplot as plt

col_list = ['r', 'g']  # one bar color per cat1 group
ax = plt.subplot(111)
for (idx, (grp, val)) in enumerate(df.groupby('cat1')):
    ax.bar(val.cat2+0.25*idx-0.25, 
           val.val, width=0.25,  
           color=col_list[idx], 
           label=grp)
plt.legend()

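
The snippet above draws all groups side by side on one axis. If one panel per cat1 is wanted instead, as the question title suggests, a loop over the groups with plt.subplots is one way to do it. The following is only a sketch: the frame mirrors the one printed above with values rounded, and the titles and figure size are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Same shape as the frame printed above; values rounded for brevity.
df = pd.DataFrame({
    "cat1": ["A"] * 4 + ["B"] * 4,
    "cat2": [1, 2, 3, 4, 1, 2, 3, 4],
    "val":  [0.01, 0.88, 0.03, 0.53, 0.51, 0.41, 0.41, 0.41],
})

groups = df.groupby("cat1")

# One row of axes per cat1 group; squeeze=False keeps axes a 2-D array
# even if there happens to be only one group.
fig, axes = plt.subplots(nrows=groups.ngroups, sharex=True, squeeze=False,
                         figsize=(6, 2 * groups.ngroups))

for ax, (name, group) in zip(axes.ravel(), groups):
    # Horizontal bars: cat2 labels on the y axis, val as the bar length.
    ax.barh(group["cat2"].astype(str), group["val"])
    ax.set_title(f"cat1 = {name}")
    ax.set_ylabel("cat2")

axes.ravel()[-1].set_xlabel("val")
fig.tight_layout()
```

For around 30 cat1 values, passing both nrows and ncols to plt.subplots would arrange the panels in a grid instead of a single tall column.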
