Plotting multiple histograms in Matplotlib - colors or side-by-side bars

Question

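Problem: when plotting multiple histograms in Matplotlib, I cannot tell one plot apart from another.

Minor problem: part of the left-hand label "Count" falls outside the image. Why?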

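Description

I want to plot histograms for 3 different groups. Each group is an array of 0s and 1s. I want a histogram for each one so that I can detect imbalance in the dataset.

I have them plotted separately, but I want a single figure that shows them together.

Showing the different plots side by side would be fine. I even googled plotting it in 3D, but I don't know how easy it would be to "read" or "look at" such a figure and understand it.

Right now, I want to plot the [train], [validation] and [test] bars side by side in the same figure, like this:

PS: My googling did not return any code I could understand. Also, I would appreciate it if someone could check whether I did anything crazy in my code.

Thanks a lot!

Code: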

import base64
import cStringIO

import matplotlib.pyplot as plt
import numpy as np

def generate_histogram_from_array_of_labels(Y=[], labels=[], xLabel="Class/Label", yLabel="Count", title="Histogram of Trainset"):
    plt.figure()
    plt.clf()

    colors = ["b", "r", "m", "w", "k", "g", "c", "y"]

    information = []
    for index in xrange(0, len(Y)):
        y = Y[index]

        if index >= len(colors):
            color = colors[0]
        else:
            color = colors[index]

        if labels is None:
            label = "?"
        else:
            if index < len(labels):
                label = labels[index]
            else:
                label = "?"

        unique, counts = np.unique(y, return_counts=True)
        unique_count = np.empty(shape=(unique.shape[0], 2), dtype=np.uint32)

        for x in xrange(0, unique.shape[0]):
            unique_count[x, 0] = unique[x]
            unique_count[x, 1] = counts[x]

        information.append(unique_count)

        # the histogram of the data
        n, bins, patches = plt.hist(y, unique.shape[0], normed=False, facecolor=color, alpha=0.75, range=[np.min(unique), np.max(unique) + 1], label=label)

    xticks_pos = [0.5 * patch.get_width() + patch.get_xy()[0] for patch in patches]

    plt.xticks(xticks_pos, unique)

    plt.xlabel(xLabel)
    plt.ylabel(yLabel)
    plt.title(title)
    plt.grid(True)
    plt.legend()
    # plt.show()

    string_of_graphic_image = cStringIO.StringIO()

    plt.savefig(string_of_graphic_image, format='png')
    string_of_graphic_image.seek(0)

    return base64.b64encode(string_of_graphic_image.read()), information
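For readers on Python 3, where `xrange` and `cStringIO` no longer exist, the counting and PNG-encoding steps of the function above could look roughly like the sketch below. The helper names `label_counts` and `figure_to_base64` are made up for illustration and are not part of the original code. Passing `bbox_inches="tight"` to `savefig` is one common way to keep an axis label such as "Count" inside the saved image.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def label_counts(y):
    """Return an (n_unique, 2) array of [label, count] rows (hypothetical helper)."""
    unique, counts = np.unique(y, return_counts=True)
    return np.column_stack((unique, counts))


def figure_to_base64(fig):
    """Render a figure to PNG in memory and return it base64-encoded (hypothetical helper)."""
    buffer = io.BytesIO()
    # bbox_inches="tight" grows the saved area so that axis labels are included.
    fig.savefig(buffer, format="png", bbox_inches="tight")
    plt.close(fig)
    return base64.b64encode(buffer.getvalue())


# Small made-up example: five labels, two classes.
y = np.array([0, 1, 1, 0, 1])
fig, ax = plt.subplots()
ax.hist(y, bins=2)
ax.set_ylabel("Count")
encoded = figure_to_base64(fig)
counts_table = label_counts(y)
```

`np.column_stack` replaces the explicit loop that filled `unique_count` row by row, and `io.BytesIO` takes the place of `cStringIO.StringIO` because PNG data is binary.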

Edit

After hashcode's answer, this is the new code:

def generate_histogram_from_array_of_labels(Y=[], labels=[], xLabel="Class/Label", yLabel="Count", title="Histogram of Trainset"):
    plt.figure()
    plt.clf()

    colors = ["b", "r", "m", "w", "k", "g", "c", "y"]
    to_use_colors = []
    information = []


    for index in xrange(0, len(Y)):
        y = Y[index]

        if index >= len(colors):
            to_use_colors.append(colors[0])
        else:
            to_use_colors.append(colors[index])

        unique, counts = np.unique(y, return_counts=True)
        unique_count = np.empty(shape=(unique.shape[0], 2), dtype=np.uint32)

        for x in xrange(0, unique.shape[0]):
            unique_count[x, 0] = unique[x]
            unique_count[x, 1] = counts[x]

        information.append(unique_count)

    unique, counts = np.unique(Y[0], return_counts=True)
    histrange = [np.min(unique), np.max(unique) + 1]
    # the histogram of the data
    n, bins, patches = plt.hist(Y, 1000, normed=False, alpha=0.75, range=histrange, label=labels)


    #xticks_pos = [0.5 * patch.get_width() + patch.get_xy()[0] for patch in patches]

    #plt.xticks(xticks_pos, unique)

    plt.xlabel(xLabel)
    plt.ylabel(yLabel)
    plt.title(title)
    plt.grid(True)
    plt.legend()

It produces this:

- New edit:

def generate_histogram_from_array_of_labels(Y=[], labels=[], xLabel="Class/Label", yLabel="Count", title="Histogram of Trainset"):
    plt.figure()
    plt.clf()

    information = []

    for index in xrange(0, len(Y)):
        y = Y[index]

        unique, counts = np.unique(y, return_counts=True)
        unique_count = np.empty(shape=(unique.shape[0], 2), dtype=np.uint32)

        for x in xrange(0, unique.shape[0]):
            unique_count[x, 0] = unique[x]
            unique_count[x, 1] = counts[x]

        information.append(unique_count)

    n, bins, patches = plt.hist(Y, normed=False, alpha=0.75, label=labels)

    plt.xticks((0.25, 0.75), (0, 1))

    plt.xlabel(xLabel)
    plt.ylabel(yLabel)
    plt.title(title)
    plt.grid(True)
    plt.legend()

It works now, but the label on the left is slightly out of bounds, and I would like to center the bars better... How can I do that?

Result:

Answer 1

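I tried it and came up with this. You can change the xticks positions in the code. Simply put, all you have to do is pass a tuple to plt.hist; it couldn't be simpler! So suppose you have two lists of 0s and 1s; then all you have to do is -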

a = np.random.randint(2, size=1000)
b = np.random.randint(2, size=1000)
plt.hist((a, b), 2, label = ("data1", "data2"))
plt.legend()
plt.xticks((0.25, 0.75), (0, 1))

The exact code I tried to run (after changing the number of bins to 2) -

a = np.random.randint(2, size=1000)
b = np.random.randint(2, size=1000)
y = [a, b]
labels = ["data1", "data2"]
generate_histogram_from_array_of_labels(Y = y, labels = labels)

I got the same result...
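On the follow-up about centering the bars and the clipped "Count" label: one possible approach, sketched here under my own assumptions rather than taken from this answer, is to count the labels yourself and draw grouped bars with `ax.bar`, so that each group sits exactly on its tick. The function name `grouped_label_bars` and its parameters are hypothetical. `fig.tight_layout()` is a standard Matplotlib call that usually leaves enough room for the y-axis label.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def grouped_label_bars(Y, labels, classes=(0, 1), group_width=0.8):
    """Draw one bar per dataset for each class, centered on the class tick."""
    fig, ax = plt.subplots()
    positions = np.arange(len(classes))
    bar_width = group_width / len(Y)
    for i, (y, label) in enumerate(zip(Y, labels)):
        counts = [int(np.sum(np.asarray(y) == c)) for c in classes]
        # Shift each dataset so that the whole group is centered on its tick.
        offset = (i - (len(Y) - 1) / 2) * bar_width
        ax.bar(positions + offset, counts, width=bar_width, label=label)
    ax.set_xticks(positions)
    ax.set_xticklabels([str(c) for c in classes])
    ax.set_xlabel("Class/Label")
    ax.set_ylabel("Count")
    ax.legend()
    fig.tight_layout()  # adjusts margins so the y-axis label fits in the figure
    return fig, ax


# Made-up data: three datasets of different sizes.
rng = np.random.default_rng(0)
train, validation, test = (rng.integers(0, 2, size=n) for n in (1000, 300, 200))
fig, ax = grouped_label_bars([train, validation, test],
                             ["train", "validation", "test"])
```

Unlike `plt.hist`, this does not depend on where the bin edges fall, so the tick positions do not need to be hard-coded as `(0.25, 0.75)`.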

Answer 2

If your datasets are of equal length, you can do this easily with pandas. So suppose you have

import numpy

N = 1000
train, validation, test = [numpy.random.randint(2, size=N) for _ in range(3)]
Y = [train, validation, test]

you can simply do

import pandas

df = pandas.DataFrame(list(zip(*Y)), columns=['Train', 'Validation', 'Test'])
df.apply(pandas.value_counts).plot.bar()

This results in this plot:

If you also import seaborn, it looks a bit nicer:
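The equal-length requirement comes from zipping the arrays into rows. If the train, validation and test sets differ in size, a variation that counts each array separately may work instead. This is only a sketch: the helper name `count_table` is hypothetical, and it uses `Series.value_counts` rather than the top-level `pandas.value_counts` shown above.

```python
import numpy as np
import pandas as pd


def count_table(datasets):
    """Build a DataFrame of label counts, one column per dataset.

    `datasets` maps a column name to a 1-D array of labels; the arrays
    may differ in length.
    """
    counts = {name: pd.Series(values).value_counts()
              for name, values in datasets.items()}
    # A label that is missing from a dataset becomes 0 instead of NaN.
    return pd.DataFrame(counts).fillna(0).astype(int).sort_index()


# Made-up data with three different lengths.
table = count_table({
    "Train": np.array([0, 1, 1, 1, 0, 1]),
    "Validation": np.array([0, 0, 1]),
    "Test": np.array([1, 1]),
})
```

Calling `table.plot.bar()` afterwards should draw the same kind of grouped bar chart as in the equal-length case.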

Solution 1
7 Accepted 2016-06-29 18:19:17

Solution 2
1 2016-07-03 15:41:28