Multiple bar plots from a pandas DataFrame

I have a data frame like this:

exp_name,index,items,clicks
"foo",0,"apple",200
"foo",0,"banana",300
"foo",0,"melon",220
"foo",1,"apple",10
"foo",1,"peach",20
"bar",0,"apple",400
"bar",0,"banana",500
"bar",0,"melon",240
"bar",1,"apple",500

and so on.

For each experiment name, I want to plot a bar chart of the number of clicks for each item in each index, colored by index. So plot 1 would be for experiment "foo": all the bars where index == 0 in one color, and the bars for index 1 in another color.

If an item is missing (for example, peach appears in "foo", 1 but nowhere else), its clicks should be set to zero in the other places.

I copied your data into a text file called 'test.txt' and renamed "index" to "status" to avoid confusion with the DataFrame index. I then used the Seaborn library to make bar plots with the conditions you mention (as I understand them). In this first version I use subplots rather than color to set apart "status", because I personally think it looks cleaner; I use color further below, since that's what you asked about.

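import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# 'test.txt' holds the data above, with the "index" column renamed to "status";
# skipinitialspace drops any blanks that follow the commas
df = pd.read_csv('test.txt', skipinitialspace=True)
# catplot (called factorplot before seaborn 0.9) creates its own figure:
# one panel per status value, bars colored by experiment name
sns.catplot(x="items", y="clicks", hue="exp_name", col="status", data=df, kind="bar")
plt.show()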

This gives a figure with one panel per status value, with the bars colored by experiment name.
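The question also asks that items missing from a given experiment/status combination be treated as zero clicks. A minimal sketch of one way to do that in pandas before plotting, assuming the "index" column has been renamed to "status" as above; the names `csv_text`, `full_index` and `df_filled` are made up for this illustration, and the data is inlined only so the snippet runs on its own:

```python
import io

import pandas as pd

# Sample rows from the question, with "index" already renamed to "status".
csv_text = """exp_name,status,items,clicks
foo,0,apple,200
foo,0,banana,300
foo,0,melon,220
foo,1,apple,10
foo,1,peach,20
bar,0,apple,400
bar,0,banana,500
bar,0,melon,240
bar,1,apple,500
"""
df = pd.read_csv(io.StringIO(csv_text))

# Every (exp_name, status, items) combination that could occur.
full_index = pd.MultiIndex.from_product(
    [df["exp_name"].unique(), df["status"].unique(), df["items"].unique()],
    names=["exp_name", "status", "items"],
)

# Reindex onto the full grid; combinations absent from the data get 0 clicks.
df_filled = (
    df.set_index(["exp_name", "status", "items"])["clicks"]
    .reindex(full_index, fill_value=0)
    .reset_index()
)
```

Passing `df_filled` instead of `df` to the plotting call makes the zeros explicit in the data. This assumes each combination appears at most once; with duplicate rows you would need to aggregate first.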

If you really want to distinguish "index" (what I call "status") by color, you can define a new variable that combines "exp_name" with "status":

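# Combine experiment name and status into one label, e.g. "foo0", "bar1"
df['exp'] = df.exp_name + df.status.astype(str)
# catplot was called factorplot before seaborn 0.9
sns.catplot(x="items", y="clicks", hue="exp", data=df, kind="bar")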

This gives a single plot in which each combination of experiment name and status has its own color.
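Another reading of the question is one plot per experiment, with the bars colored by status. A rough sketch of that variant, assuming seaborn 0.9 or later (where `catplot` is available) and the same renamed "status" column; the inlined data and the variable names are only there to keep the example self-contained:

```python
import io

import pandas as pd
import seaborn as sns

# Sample rows from the question, with "index" already renamed to "status".
csv_text = """exp_name,status,items,clicks
foo,0,apple,200
foo,0,banana,300
foo,0,melon,220
foo,1,apple,10
foo,1,peach,20
bar,0,apple,400
bar,0,banana,500
bar,0,melon,240
bar,1,apple,500
"""
df = pd.read_csv(io.StringIO(csv_text))

# One panel per experiment (col), bars within a panel colored by status (hue).
g = sns.catplot(
    data=df, x="items", y="clicks", hue="status", col="exp_name", kind="bar"
)
g.set_axis_labels("item", "clicks")
# Call matplotlib.pyplot.show() afterwards to display the figure.
```

Here `col` splits the experiments into panels and `hue` assigns the colors, which is the reverse of the first example.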

Check out the seaborn docs if you have more questions. It's a really great library for plotting categorical data. Changing the legend labels and other settings follows matplotlib conventions.
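As an illustration of the matplotlib conventions mentioned above, here is a small sketch that draws the combined-label plot on an ordinary matplotlib Axes with `sns.barplot` and then adjusts labels and the legend title with plain matplotlib calls. The label texts are arbitrary examples, and the data is inlined only so the snippet is self-contained:

```python
import io

import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Sample rows from the question, with "index" already renamed to "status".
csv_text = """exp_name,status,items,clicks
foo,0,apple,200
foo,0,banana,300
foo,0,melon,220
foo,1,apple,10
foo,1,peach,20
bar,0,apple,400
bar,0,banana,500
bar,0,melon,240
bar,1,apple,500
"""
df = pd.read_csv(io.StringIO(csv_text))
df["exp"] = df["exp_name"] + df["status"].astype(str)

# Axes-level plot: seaborn draws onto the Axes we hand it.
fig, ax = plt.subplots(figsize=(7, 5))
sns.barplot(data=df, x="items", y="clicks", hue="exp", ax=ax)

# From here on these are ordinary matplotlib calls.
ax.set_xlabel("item")
ax.set_ylabel("number of clicks")
ax.set_title("Clicks per item")
ax.get_legend().set_title("experiment + status")
# Call plt.show() afterwards to display the figure.
```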
