How to plot a grouped bar chart from multiple datasets

Question

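I'm working through Think Stats and would like to compare multiple datasets visually. The book's examples show that the module provided by the author can produce interleaved bars in a different color for each dataset. How can I get the same result in pyplot?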

Answer 1

The documentation provides a great example/demo of this:

http://matplotlib.sourceforge.net/examples/api/barchart_demo.html

Answer 2

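Call the bar function multiple times, once per series. You can control the left position of the bars with the left parameter (named x in newer matplotlib releases), and use it to keep the bars from overlapping.

Completely untested code:

import numpy
from matplotlib import pyplot

# data1 and data2 are assumed to be sequences of 10 values each
pyplot.bar( numpy.arange(10) * 2, data1, color = 'red' )
pyplot.bar( numpy.arange(10) * 2 + 1, data2, color = 'blue' )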

Data2 is drawn shifted to the right of where data1 is drawn.
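The same idea extends to any number of series by computing one horizontal offset per series. The following is only a minimal sketch: the helper names `grouped_bar_positions` and `plot_grouped_bars`, the `group_width` parameter, and the sample values are made up for illustration and are not part of matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt


def grouped_bar_positions(n_groups, n_series, group_width=0.8):
    """Return (positions, bar_width) for side-by-side bars.

    positions has shape (n_series, n_groups); row i holds the bar centers
    of series i. Group g is centered on x = g.
    (Hypothetical helper, not a matplotlib function.)
    """
    bar_width = group_width / n_series
    centers = np.arange(n_groups)
    # Offsets are symmetric around zero so each group stays centered on its tick.
    offsets = (np.arange(n_series) - (n_series - 1) / 2.0) * bar_width
    return centers[np.newaxis, :] + offsets[:, np.newaxis], bar_width


def plot_grouped_bars(ax, series, labels, colors):
    """Call ax.bar once per series, shifted so the bars do not overlap."""
    n_groups = len(series[0])
    positions, bar_width = grouped_bar_positions(n_groups, len(series))
    for pos, data, label, color in zip(positions, series, labels, colors):
        ax.bar(pos, data, width=bar_width, label=label, color=color)
    ax.set_xticks(np.arange(n_groups))
    ax.legend()
    return positions, bar_width


if __name__ == "__main__":
    # Made-up sample data: two series with four values each.
    data1 = [3, 5, 2, 7]
    data2 = [4, 1, 6, 5]
    fig, ax = plt.subplots()
    plot_grouped_bars(ax, [data1, data2], ["data1", "data2"], ["red", "blue"])
    plt.show()
```

Because the bar width is derived from the number of series, adding a third or fourth dataset only means passing a longer list; the groups stay centered on the integer ticks.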

Answer 3

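I ran into this problem a while ago and wrote a wrapper function that takes a 2D array and automatically creates a multi-bar chart:

[Image: multi-category bar chart]

Code: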

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import operator as o

import numpy as np

dpoints = np.array([['rosetta', '1mfq', 9.97],
           ['rosetta', '1gid', 27.31],
           ['rosetta', '1y26', 5.77],
           ['rnacomposer', '1mfq', 5.55],
           ['rnacomposer', '1gid', 37.74],
           ['rnacomposer', '1y26', 5.77],
           ['random', '1mfq', 10.32],
           ['random', '1gid', 31.46],
           ['random', '1y26', 18.16]])

fig = plt.figure()
ax = fig.add_subplot(111)

def barplot(ax, dpoints):
    '''
    Create a barchart for data across different categories with
    multiple conditions for each category.

    @param ax: The plotting axes from matplotlib.
    @param dpoints: The data set as an (n, 3) numpy array
    '''

    # Aggregate the conditions and the categories according to their
    # mean values
    conditions = [(c, np.mean(dpoints[dpoints[:,0] == c][:,2].astype(float))) 
                  for c in np.unique(dpoints[:,0])]
    categories = [(c, np.mean(dpoints[dpoints[:,1] == c][:,2].astype(float))) 
                  for c in np.unique(dpoints[:,1])]

    # sort the conditions, categories and data so that the bars in
    # the plot will be ordered by category and condition
    conditions = [c[0] for c in sorted(conditions, key=o.itemgetter(1))]
    categories = [c[0] for c in sorted(categories, key=o.itemgetter(1))]

    dpoints = np.array(sorted(dpoints, key=lambda x: categories.index(x[1])))

    # the space between each set of bars
    space = 0.3
    n = len(conditions)
    width = (1 - space) / (len(conditions))

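    # x positions of the category groups (1-based, one tick per category)
    indices = list(range(1, len(categories) + 1))

    # Create a set of bars at each position
    for i, cond in enumerate(conditions):
        # np.float was removed from NumPy; the builtin float is equivalent here
        vals = dpoints[dpoints[:,0] == cond][:,2].astype(float)
        pos = [j - (1 - space) / 2. + i * width for j in indices]
        # pos holds left edges, so align on the edge (the default is 'center'
        # in current matplotlib)
        ax.bar(pos, vals, width=width, label=cond, align='edge',
               color=cm.Accent(float(i) / n))

    # Set the x-axis tick labels to be equal to the categories
    ax.set_xticks(indices)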
    ax.set_xticklabels(categories)
    plt.setp(plt.xticks()[1], rotation=90)

    # Add the axis labels
    ax.set_ylabel("RMSD")
    ax.set_xlabel("Structure")

    # Add a legend
    handles, labels = ax.get_legend_handles_labels()
    ax.legend(handles[::-1], labels[::-1], loc='upper left')

barplot(ax, dpoints)
plt.show()

If you're interested in what this function does and the logic behind it, here's a (shamelessly self-promoting) link to the blog post describing it.

Answer 4

Matplotlib's example code for interleaved bar charts works well for arbitrary real-valued x coordinates (as mentioned by @db42).

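However, if your x coordinates are categorical values (like the dictionaries in the linked question), converting from categorical x coordinates to real x coordinates is cumbersome and unnecessary.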
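For comparison, this is roughly what the manual conversion from category labels to numeric x positions involves. It is a sketch only: the function names `label_positions` and `bars_from_dict` are hypothetical, and it assumes every dictionary uses the same keys in the same order.

```python
import numpy as np
import matplotlib.pyplot as plt


def label_positions(labels):
    """Map each category label to an integer x position, in order of appearance."""
    return {label: i for i, label in enumerate(labels)}


def bars_from_dict(ax, values, offset=0.0, width=0.35, **bar_kwargs):
    """Plot a {label: value} dict at numeric positions shifted by offset.

    (Hypothetical helper for illustration, not a matplotlib function.)
    """
    positions = label_positions(values.keys())
    x = np.array([positions[key] for key in values]) + offset
    bars = ax.bar(x, list(values.values()), width=width, **bar_kwargs)
    # After plotting numerically, the tick labels must be restored by hand.
    ax.set_xticks(list(positions.values()))
    ax.set_xticklabels(list(positions.keys()))
    return bars


if __name__ == "__main__":
    # Sample values borrowed from the dictionaries used later in this answer.
    men_means = {'G1': 20, 'G2': 35, 'G3': 30, 'G4': 35, 'G5': 27}
    women_means = {'G1': 25, 'G2': 32, 'G3': 34, 'G4': 20, 'G5': 25}
    fig, ax = plt.subplots()
    bars_from_dict(ax, men_means, offset=-0.175, color='r', label='Men')
    bars_from_dict(ax, women_means, offset=+0.175, color='y', label='Women')
    ax.legend()
    plt.show()
```

The extra bookkeeping (a label-to-position mapping plus manual tick labels) is what passing the categorical keys to matplotlib directly avoids.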

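You can plot the two dictionaries side by side directly with matplotlib's API. The trick for drawing two bar charts offset from each other is to set align='edge' and use a positive width (+width) for one bar chart and a negative width (-width) for the other.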

The example code, modified to plot two dictionaries, looks like this:

"""
========
Barchart
========

A bar plot with errorbars and height labels on individual bars
"""
import matplotlib.pyplot as plt

# Uncomment the following line if you use ipython notebook
# %matplotlib inline

width = 0.35       # the width of the bars

men_means = {'G1': 20, 'G2': 35, 'G3': 30, 'G4': 35, 'G5': 27}
men_std = {'G1': 2, 'G2': 3, 'G3': 4, 'G4': 1, 'G5': 2}

# Convert the dict views to lists so they are plain sequences for matplotlib
rects1 = plt.bar(list(men_means.keys()), list(men_means.values()), -width,
                align='edge', yerr=list(men_std.values()), color='r', label='Men')

women_means = {'G1': 25, 'G2': 32, 'G3': 34, 'G4': 20, 'G5': 25}
women_std = {'G1': 3, 'G2': 5, 'G3': 2, 'G4': 3, 'G5': 3}

# Convert the dict views to lists so they are plain sequences for matplotlib
rects2 = plt.bar(list(women_means.keys()), list(women_means.values()), +width,
                align='edge', yerr=list(women_std.values()), color='y', label='Women')

# add some text for labels, title and axes ticks
plt.xlabel('Groups')
plt.ylabel('Scores')
plt.title('Scores by group and gender')
plt.legend()

def autolabel(rects):
    """
    Attach a text label above each bar displaying its height
    """
    for rect in rects:
        height = rect.get_height()
        plt.text(rect.get_x() + rect.get_width()/2., 1.05*height,
                '%d' % int(height),
                ha='center', va='bottom')

autolabel(rects1)
autolabel(rects2)

plt.show()

Result:


Answer 1: score 11, 2011-07-15 19:44:28
Answer 2: score 8, accepted, 2011-04-26 15:39:21
Answer 3: score 3, 2014-09-14 20:35:15
Answer 4: score 2, 2018-06-11 20:29:33
