
How do I plot one dimension as stacked and one normal in a bar graph with pandas?

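I want to plot a bar graph using pandas that has two categorical variables and 5 numeric columns. I want to first group by one categorical variable and show the sums as grouped bars. I also want to group by the second categorical variable and have each bar show the second category as a stacked bar.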

A sample dataframe like mine can be constructed as follows:

import random
import pandas as pd
l=100
df = pd.DataFrame({'op1': [random.randint(0,1) for x in range(l)], 
                    'op2': [random.randint(0,1) for x in range(l)], 
                    'op3': [random.randint(0,1) for x in range(l)], 
                    'op4': [random.randint(0,1) for x in range(l)], 
                    'op5': [random.randint(0,1) for x in range(l)],
                    'cat': random.choices(list('abcde'), k=l),
                    'gender': random.choices(list('mf-'), k=l)})
df.head(10)

  cat gender  op1  op2  op3  op4  op5
0   d      m    1    1    1    1    1
1   a      m    1    1    0    0    1
2   b      -    1    0    1    0    1
3   c      m    0    1    0    0    0
4   b      -    0    0    1    1    0
5   c      f    1    1    1    1    1
6   a      -    1    1    0    1    0
7   d      f    1    0    1    0    1
8   d      m    1    1    0    1    0
9   b      -    1    0    1    0    0

I can easily produce the grouped bars: df.groupby('cat')[['op%s' % i for i in range(1,6)]].sum().plot.bar()
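For reference, the aggregation behind that one-liner can be checked on its own before plotting. A minimal sketch, assuming a tiny hand-made frame (the values are illustrative, not the random sample above) with only two of the op columns:

```python
import pandas as pd

# Tiny hand-made frame; the values are made up for illustration only
small = pd.DataFrame({
    'cat':    ['a', 'a', 'b', 'b'],
    'gender': ['m', 'f', 'm', 'm'],
    'op1':    [1, 0, 1, 1],
    'op2':    [0, 1, 1, 0],
})

op_cols = ['op1', 'op2']

# One row per category, one column per op: this is the table .plot.bar() draws,
# with each row becoming a group of bars and each column a bar within the group
sums = small.groupby('cat')[op_cols].sum()
print(sums)
```

The gender column plays no part in this table, which is why the grouped plot cannot show it.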

But how do I get each bar to show the gender breakdown?

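Inspired by the thread that vbox pointed me to, I implemented it using a series of subplots and then fiddled with the colors. It's quite a kludge, and anyone who wants to use it with a more variable dataset will need to solve a few problems, but I'm posting it here in case it helps.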

import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import random

l=100
df = pd.DataFrame({'op1': [random.randint(0,1) for x in range(l)], 
                    'op2': [random.randint(0,1) for x in range(l)], 
                    'op3': [random.randint(0,1) for x in range(l)], 
                    'op4': [random.randint(0,1) for x in range(l)], 
                    'op5': [random.randint(0,1) for x in range(l)],
                    'cat': random.choices(list('abcde'), k=l),
                    'gender': random.choices(list('mf'), k=l)})

# grab the colors in the current setup (could just use a new cycle instead)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

values = df['cat'].unique()
l = len(values)

# make one subplot for every possible value
fig, axes = plt.subplots(1, l, sharey=True)

for i, value in enumerate(values):
    ax = axes[i]

    # make a dataset that includes gender and all options, then change orientation
    df2 = df[df['cat'] == value][['gender', 'op1', 'op2', 'op3', 'op4', 'op5']].groupby('gender').sum().transpose()

    # do the stacked plot. 
    # Note this has all M's one color, F's another
    # but we want each bar to have its own colour scheme
    df2.plot.bar(stacked=True, width=1, ax=ax, legend=False)

    # kludge to change bar colors
    # Note: this won't work if one gender is not present
    # or if there is a 3rd option for gender, as there is in the sample data
    # for this example, I've changed gender to just be m/f
    bars = [rect for rect in ax.get_children() if isinstance(rect, mpl.patches.Rectangle)]
    for c, b in enumerate(bars[:len(df2)*2]):
        b.set_color(colors[c%len(df2)])
        if c >= len(df2):
            b.set_alpha(0.5)

    ax.spines["top"].set_visible(False)   
    ax.spines["bottom"].set_color('grey')
    ax.spines["right"].set_visible(False)  
    ax.spines["left"].set_visible(False) 
    ax.set_xticks([])
    ax.set_xlabel(value, rotation=45)

What the output looks like: [image not included]
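As an alternative to recolouring rectangles after the fact, the bars can be drawn directly with matplotlib's `Axes.bar` and its `bottom` argument, which should cope with a third gender value or a missing one. This is only a sketch and not the approach used above: the helper name `plot_grouped_stacked`, the `demo` frame, and the alpha shading scheme are my own assumptions for illustration.

```python
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_grouped_stacked(df, group_col, stack_col, value_cols, ax=None):
    """Hypothetical helper: one bar per value column inside each group,
    with every bar stacked by the levels of stack_col."""
    if ax is None:
        _, ax = plt.subplots()

    colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
    groups = sorted(df[group_col].unique())
    x = np.arange(len(groups))
    width = 0.8 / len(value_cols)  # leave a gap between groups

    for j, col in enumerate(value_cols):
        # rows = groups, columns = stack levels; missing combinations become 0
        table = (df.groupby([group_col, stack_col])[col]
                   .sum()
                   .unstack(stack_col, fill_value=0))
        bottom = np.zeros(len(table))
        for k, level in enumerate(table.columns):
            heights = table[level].to_numpy(dtype=float)
            # same hue per value column, lighter shade per stack level
            ax.bar(x + j * width, heights, width, bottom=bottom,
                   color=colors[j % len(colors)],
                   alpha=1 - 0.7 * k / len(table.columns),
                   label=f'{col} / {level}')
            bottom += heights

    # centre the tick under each group of bars
    ax.set_xticks(x + width * (len(value_cols) - 1) / 2)
    ax.set_xticklabels(groups)
    ax.legend(ncol=len(value_cols), fontsize='small')
    return ax.figure, ax


# Sample data shaped like the question's frame, including the third gender value
random.seed(0)
n = 100
op_cols = ['op%s' % i for i in range(1, 6)]
demo = pd.DataFrame({c: [random.randint(0, 1) for _ in range(n)] for c in op_cols})
demo['cat'] = random.choices(list('abcde'), k=n)
demo['gender'] = random.choices(list('mf-'), k=n)

fig, ax = plot_grouped_stacked(demo, 'cat', 'gender', op_cols)
```

Call `plt.show()` afterwards when running outside a notebook. Because each segment is drawn explicitly, the colours never have to be patched up after plotting.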
