
Seaborn Barplot: Heterogeneous groups of bars

I'm using seaborn to plot the results of different algorithms. I want to distinguish both the algorithms and their classification ("group"). The problem is that not all algorithms are in all groups, so when I put the group on the x-axis and use the algorithm as hue, I get a lot of blank space:

import seaborn as sns
group = ['Simple', 'Simple', 'Complex', 'Complex', 'Cool']
alg = ['Alg 1', 'Alg 2', 'Alg 3', 'Alg 4', 'Alg 2']
results = [i+1 for i in range(len(group))]
# recent seaborn versions require x and y to be passed as keywords
sns.barplot(x=group, y=results, hue=alg)

[Image: bar chart]

As you can see, seaborn reserves space in every group for bars from all algorithms, which leads to lots of blank space. How can I avoid that? I do want to show the different groups on the x-axis and distinguish the different algorithms by color/style. Algorithms may occur in multiple but not all groups. I just want space for 2 bars in "Simple" and "Complex" and for 1 bar in "Cool". Solutions in pure matplotlib are also welcome; it doesn't need to be seaborn. I'd like to keep the seaborn color palette, though.

There doesn't seem to be a standard way to create this type of grouped bar plot. The following code creates a list of positions for the bars, a list of their colors, and lists of the group labels and their tick positions.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.patches import Patch

group = ['Simple', 'Simple', 'Complex', 'Complex', 'Cool']
alg = ['Alg 1', 'Alg 2', 'Alg 3', 'Alg 4', 'Alg 2']
colors = plt.cm.tab10.colors
alg_cat = pd.Categorical(alg)
alg_colors = [colors[c] for c in alg_cat.codes]

results = [i + 1 for i in range(len(group))]

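dist_groups = 0.4 # distance between successive groups
# each bar sits 1 unit after the previous one, plus dist_groups whenever the group changes
pos = (np.array([0] + [g1 != g2 for g1, g2 in zip(group[:-1], group[1:])]) * dist_groups + 1).cumsum()
# one label per run of equal group names (assumes each group's entries are contiguous)
labels = [g1 for g1, g2 in zip(group[:-1], group[1:]) if g1 != g2] + group[-1:]
# place each tick at the mean position of the bars in its group
label_pos = [sum([p for g, p in zip(group, pos) if g == label]) / len([1 for g in group if g == label])
             for label in labels]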
plt.bar(pos, results, color=alg_colors)
plt.xticks(label_pos, labels)
handles = [Patch(color=colors[c], label=lab) for c, lab in enumerate(alg_cat.categories)]
plt.legend(handles=handles)
plt.show()

[Image: resulting plot]
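The position bookkeeping in the code above can also be checked without plotting anything. The following is only a sketch in plain Python: `group_bar_positions` and its `gap` parameter are hypothetical names introduced here, not part of the answer, and it assumes (as the code above does) that entries of the same group are adjacent in the input.

```python
from itertools import groupby


def group_bar_positions(groups, gap=0.4):
    """Hypothetical helper: bar x-positions, group labels and tick positions.

    Bars are 1 unit apart; an extra `gap` is added whenever the group changes.
    Assumes entries belonging to the same group are contiguous in `groups`.
    """
    positions, labels, ticks = [], [], []
    x = 0.0
    for label, members in groupby(groups):
        n = len(list(members))
        if positions:  # not the first group: insert the extra gap
            x += gap
        run = []
        for _ in range(n):
            x += 1.0
            run.append(x)
        positions.extend(run)
        labels.append(label)
        ticks.append(sum(run) / n)  # tick at the centre of the group's bars
    return positions, labels, ticks


group = ['Simple', 'Simple', 'Complex', 'Complex', 'Cool']
positions, labels, ticks = group_bar_positions(group)
print(positions)  # x-position of each bar
print(labels)     # one label per group
print(ticks)      # where each group label goes on the x-axis
```

The returned lists could then be passed to plt.bar(positions, results, color=alg_colors) and plt.xticks(ticks, labels) in place of pos, label_pos and labels.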

While one could handle this case completely within matplotlib and numpy, I've solved it via pandas. The reason is that you need a way to do the categorical groupings correctly, and this is one of the main strengths of pandas.

What I did, essentially, is create a DataFrame from your data and group it by the group column. While iterating over the enumerated groups with index i = 0, 1, 2, ..., we create a set of ax.bar() calls, each confined to the interval [i-0.5, i+0.5]. The colours are taken from a seaborn palette, as requested, and are reused at the end to build a custom legend.

import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import pandas as pd
import numpy as np

group = ['Simple', 'Simple', 'Complex', 'Complex', 'Cool']
alg = ['Alg 1', 'Alg 2', 'Alg 3', 'Alg 4', 'Alg 2']
results = [i+1 for i in range(len(group))]

df = pd.DataFrame({'group':group, 'alg':alg, 'results':results})

## this next line is only required if you have a specific order in mind;
#  else, the .groupby() method will sort alphabetically!
df['group'] = pd.Categorical(df['group'], ["Simple", "Complex", "Cool"])

## choose the desired seaborn color palette here:
palette = sns.color_palette("Paired", len(alg))
labels, levels = pd.factorize(df['alg'])
df['color'] = [palette[l] for l in labels]

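## observed=True avoids the FutureWarning newer pandas emits for categorical keys;
#  all three categories occur in the data, so the groups are unchanged
gdf = df.groupby('group', observed=True)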

fig,ax=plt.subplots(figsize=(5,3))  

xt = []
xtl = []

min_width = 1/max([len(item) for (key,item) in gdf])


for i,(key,item) in enumerate(gdf):
    xt.append(i)
    xtl.append(key)

    ## for each enumerated group, we need to create the proper x-scale placement
    #  such that each bar plot is centered around " i "
    #  i.e. two bars near " i = 0 " will be placed at [-0.25, 0.25] with widths of 0.5
    #  so that they won't collide with bars near " i = 1 "
    #  which themselves are placed at [0.75 1.25] 
    rel = np.linspace(0,1,len(item)+1)[:-1]
    rel -= rel.mean() 
    rel +=i

    w = 1/(len(item))
    ## note that the entire interval width (i.e. from i-0.5 to i+0.5) will be filled with the bars,
    #  meaning that the individual bar widths will vary depending on the number of bars.
    #  either adjust the bar width like this to add some whitespace:
    # w *= 0.9 
    ## or alternatively, you could use a fixed width instead:
    # w = 0.4
    ## or, by pre-evaluating the minimal required bar width:
    # w = min_width

    ax.bar(rel,item['results'].values,alpha=1,width=w,color=item['color'])

leg = []
for i,l in enumerate(levels):
    p = mpatches.Patch(color=palette[i], label=l)
    leg.append(p)
ax.legend(handles=leg)

ax.set_xticks(xt)
ax.set_xticklabels(xtl)
ax.grid()
plt.show()

The result (using sns.color_palette("Paired") and w=1/len(item)) then looks like this:

[Image: nice bar plot]
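The placement arithmetic inside the loop can be verified in isolation. This is a minimal sketch in plain Python rather than the np.linspace form used above; `bar_slots` is a hypothetical name introduced only for illustration. It computes the centres and the common width of n bars that exactly fill the interval [i-0.5, i+0.5], assuming centre-aligned bars (the ax.bar() default).

```python
def bar_slots(i, n):
    """Hypothetical helper: centres and width of n equal bars tiling [i-0.5, i+0.5]."""
    width = 1.0 / n
    # offsets are symmetric around 0, e.g. n=2 gives -0.25 and +0.25
    centers = [i + (k - (n - 1) / 2) * width for k in range(n)]
    return centers, width


for i, n in enumerate([2, 2, 1]):  # bars per group: Simple, Complex, Cool
    centers, width = bar_slots(i, n)
    print(i, centers, width)
```

Multiplying the returned width by a factor such as 0.9 leaves whitespace between neighbouring bars, which corresponds to the commented-out w *= 0.9 option in the code above.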
