
Draw a grouped bar graph from a dataframe that originated from an SQL query grouped on multiple attributes

I'm relatively new to pandas and numpy.

What I'm trying to achieve is:

  • Start from a dataframe that originated from an SQL query that groups errors by a tag_name (which identifies the type of task that raised the error), the error message, and the number of occurrences of that error for the specific meta-tag.
  • Build a grouped bar graph in which every group on the x axis represents a meta-tag, each bar is an error message, and the bar height is the number of occurrences.

Here is a sample dataframe:

        name                                           message  occurred
0  meta-tag1                                       InvalidPlan         1
1  meta-tag1  Maximun number of attempts at planning surpassed       276
2  meta-tag1                               Rescheduling worker       275
3  meta-tag2                                       InvalidPlan        18
4  meta-tag3  Maximun number of attempts at planning surpassed        22
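
For anyone who wants to reproduce this, the sample above can be rebuilt by hand roughly as follows. This is only a sketch: in the real setup the frame comes from the SQL query, and the variable name `df` is an assumption made here for illustration.

```python
import pandas as pd

# Hand-built copy of the sample shown above (assumption: the real frame
# comes from the SQL query and has these same three columns).
df = pd.DataFrame(
    {
        "name": ["meta-tag1", "meta-tag1", "meta-tag1", "meta-tag2", "meta-tag3"],
        "message": [
            "InvalidPlan",
            "Maximun number of attempts at planning surpassed",
            "Rescheduling worker",
            "InvalidPlan",
            "Maximun number of attempts at planning surpassed",
        ],
        "occurred": [1, 276, 275, 18, 22],
    }
)

print(df)
```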

I can't seem to find a solution that produces the result I want.

At first I used np.unique to build two lists containing the unique meta-tags and the unique error messages.

Then I generated a list of dataframes, one per meta-tag, and from each sub-dataframe an array containing only the occurrences per error. I tried feeding pyplot the list of arrays with the occurrences, using the unique error messages as columns and the unique meta-tags as the index, but I couldn't get it working, and I'm pretty sure it's the wrong approach.

I'm pretty sure this can be achieved just by manipulating the dataframe in the correct way, which is hard for me at my current level. Any suggestion is very welcome.

I think this is what you want. Let's call your dataframe df:

t=df.groupby(['name', 'message'])['occurred'].sum().unstack('message').fillna(0)

t.plot(kind='bar', stacked=True)
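
Note that stacked=True piles the messages on top of each other within each meta-tag. If side-by-side bars are wanted, as the question describes, a minimal sketch could look like the one below. The helper names `pivot_errors` and `plot_grouped`, the hand-built sample frame, and the output file name are assumptions made for illustration; only the groupby/unstack/plot calls come from the snippet above.

```python
import pandas as pd

# Hand-built copy of the sample frame from the question (stand-in for the SQL result).
df = pd.DataFrame(
    [
        ("meta-tag1", "InvalidPlan", 1),
        ("meta-tag1", "Maximun number of attempts at planning surpassed", 276),
        ("meta-tag1", "Rescheduling worker", 275),
        ("meta-tag2", "InvalidPlan", 18),
        ("meta-tag3", "Maximun number of attempts at planning surpassed", 22),
    ],
    columns=["name", "message", "occurred"],
)


def pivot_errors(frame):
    """Hypothetical helper: one row per meta-tag, one column per message.

    Pairs that never occurred are filled with 0 so every group has every bar.
    """
    return (
        frame.groupby(["name", "message"])["occurred"]
        .sum()
        .unstack("message", fill_value=0)
    )


def plot_grouped(table):
    """Hypothetical helper: one group per meta-tag, one bar per message."""
    ax = table.plot(kind="bar", stacked=False, rot=0)
    ax.set_xlabel("meta-tag")
    ax.set_ylabel("occurrences")
    return ax


table = pivot_errors(df)

if __name__ == "__main__":
    ax = plot_grouped(table)
    ax.figure.savefig("grouped_errors.png")  # hypothetical output file name
```

With the sample data, table has three rows (the meta-tags) and three columns (the messages). meta-tag2 and meta-tag3 get zero-height bars for the messages they never produced.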
