seaborn: barplot of a dataframe by group

I am having difficulty with this. I have the results from my initial model (`Unfiltered`), which I plot like so:

import pandas as pd
import seaborn as sns

df = pd.DataFrame(
    {'class': ['foot', 'bike', 'bus', 'car', 'metro'],
     'Precision': [0.7, 0.66, 0.41, 0.61, 0.11],
     'Recall': [0.58, 0.35, 0.13, 0.89, 0.02],
     'F1-score': [0.64, 0.45, 0.2, 0.72, 0.04]}
)

# Reshape to long format: one row per (class, Metric) pair
groups = df.melt(id_vars=['class'], var_name='Metric')
sns.barplot(data=groups, x='class', y='value', hue='Metric')

This produces a nice grouped bar plot.

Now I have obtained a second set of results from my improved model (`Filtered`), so I add a column (`status`) to my df to indicate which model each result comes from, like this:

df2 = pd.DataFrame(
    {'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
               'car', 'car', 'metro', 'metro'],
     'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
     'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
     'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
     'status': ['Unfiltered', 'Filtered', 'Unfiltered', 'Filtered', 'Unfiltered',
                'Filtered', 'Unfiltered', 'Filtered', 'Unfiltered', 'Filtered']}
)

df2.head()
   class  Precision  Recall  F1-score      status
0   foot       0.70    0.58      0.64  Unfiltered
1   foot       0.62    0.93      0.74    Filtered
2   bike       0.66    0.35      0.45  Unfiltered
3   bike       0.96    0.40      0.56    Filtered
4    bus       0.41    0.13      0.20  Unfiltered

I want to plot this with the same grouping as above (i.e. foot, bike, bus, car, metro). However, for each metric I want to place the two values side by side. Take the foot group, for example: I would have two bars for Precision [Unfiltered, Filtered], then two bars for Recall [Unfiltered, Filtered], and two bars for F1-score [Unfiltered, Filtered]. Likewise for all other groups.

My attempt:

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')
sns.barplot(data=group2, x='class', y='value', hue='Metric')

Totally not what I want.

You can pass any sequence to `hue`, as long as it has the same length as your data, and assign colours through it. So you could try:

import matplotlib.pyplot as plt

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')
# Use the (Metric, status) pair of each row as the hue value
sns.barplot(data=group2, x='class', y='value',
            hue=group2[['Metric', 'status']].agg(tuple, axis=1))
plt.legend(fontsize=7)

But the result is a bit hard to read.
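One way to make the combined hue easier to read is to build a plain string label instead of a tuple, fix the bar order, and pick a palette in which each metric keeps one colour family. The following is only a sketch under those assumptions: the column name `Metric (status)`, the `hue_order` list, and the choice of the `Paired` palette are illustrative and not part of the answer above.

```python
import pandas as pd
import seaborn as sns

df2 = pd.DataFrame({
    'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
              'car', 'car', 'metro', 'metro'],
    'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
    'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
    'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
    'status': ['Unfiltered', 'Filtered'] * 5,
})

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')

# Combine the two grouping variables into a single label with 6 levels
# (the column name is an illustrative choice).
group2['Metric (status)'] = group2['Metric'] + ' (' + group2['status'] + ')'

# Fix the bar order so the two bars of each metric sit next to each other.
metrics = ['Precision', 'Recall', 'F1-score']
statuses = ['Unfiltered', 'Filtered']
hue_order = [f'{m} ({s})' for m in metrics for s in statuses]

# 'Paired' alternates a light and a dark shade of the same colour, so each
# metric gets one colour and the status is encoded by the shade.
ax = sns.barplot(
    data=group2, x='class', y='value',
    hue='Metric (status)', hue_order=hue_order,
    palette=sns.color_palette('Paired', n_colors=len(hue_order)),
)
ax.legend(title='Metric (status)', fontsize=7)
```

Call `plt.show()` or `ax.figure.savefig(...)` afterwards to display or save the figure.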

Seaborn grouped barplots don't allow for multiple grouping variables. One workaround is to recode the two grouping variables (Metric and status) as one variable with 6 levels. Another possibility is to use facets. If you are open to another plotting package, I might recommend plotnine, which allows multiple grouping variables as follows:

import plotnine as p9

fig = (
    p9.ggplot(group2)
    + p9.geom_col(
        p9.aes(x="class", y="value", fill="Metric", color="Metric", alpha="status"),
        position=p9.position_dodge(1),
        size=1,
        width=0.5,
    )
    + p9.scale_color_manual(("red", "blue", "green"))
    + p9.scale_fill_manual(("red", "blue", "green"))
)

fig.draw()

This generates a plot in which the fill colour encodes the metric and the transparency encodes the status.
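The facet option mentioned above can also be done in seaborn itself. As a rough sketch, assuming one panel per class is acceptable, `sns.catplot` can put the metric on the x axis and use `status` as the hue, so the two bars of each metric end up side by side. The layout parameters (`col_wrap`, `height`, `aspect`) are arbitrary choices made for illustration.

```python
import pandas as pd
import seaborn as sns

df2 = pd.DataFrame({
    'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
              'car', 'car', 'metro', 'metro'],
    'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
    'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
    'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
    'status': ['Unfiltered', 'Filtered'] * 5,
})

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')

# One panel per class. Inside each panel the x axis is the metric and the
# hue is the model status, so each metric shows two adjacent bars.
g = sns.catplot(
    data=group2, kind='bar',
    x='Metric', y='value', hue='status',
    col='class', col_wrap=3,   # layout values are illustrative
    height=3, aspect=1,
)
g.set_axis_labels('', 'value')
g.set_titles('{col_name}')
```

`catplot` returns a `FacetGrid`; `g.savefig(...)` writes the whole grid to a file.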
