
How do I plot one dimension as stacked and one normal in a bar graph with pandas?

I would like to plot a bar graph, using pandas, from a dataframe that has two categorical variables and five numeric columns. First, I would like to group by one categorical variable and show the sums as grouped bars. Second, I would like to split each of those bars by the second categorical variable, shown as stacked segments.

A sample dataframe like mine can be constructed as follows:

import random
import pandas as pd
l=100
df = pd.DataFrame({'op1': [random.randint(0,1) for x in range(l)], 
                    'op2': [random.randint(0,1) for x in range(l)], 
                    'op3': [random.randint(0,1) for x in range(l)], 
                    'op4': [random.randint(0,1) for x in range(l)], 
                    'op5': [random.randint(0,1) for x in range(l)],
                    'cat': random.choices(list('abcde'), k=l),
                    'gender': random.choices(list('mf-'), k=l)})
df.head()

  cat gender  op1  op2  op3  op4  op5
0   d      m    1    1    1    1    1
1   a      m    1    1    0    0    1
2   b      -    1    0    1    0    1
3   c      m    0    1    0    0    0
4   b      -    0    0    1    1    0
5   c      f    1    1    1    1    1
6   a      -    1    1    0    1    0
7   d      f    1    0    1    0    1
8   d      m    1    1    0    1    0
9   b      -    1    0    1    0    0

I can produce the grouped bar easily enough: df.groupby('cat')[['op%s' % i for i in range(1,6)]].sum().plot.bar()

But how can I get each bar to show the gender breakdown?

Inspired by the thread that vbox pointed me to, I implemented this using a series of subplots and some fiddling with the colors. It is pretty kludgy, and anyone who wants to use it with a more variable dataset will need to address the concerns noted in the comments, but I am posting it here in case it is helpful.

import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import random

l=100
df = pd.DataFrame({'op1': [random.randint(0,1) for x in range(l)], 
                    'op2': [random.randint(0,1) for x in range(l)], 
                    'op3': [random.randint(0,1) for x in range(l)], 
                    'op4': [random.randint(0,1) for x in range(l)], 
                    'op5': [random.randint(0,1) for x in range(l)],
                    'cat': random.choices(list('abcde'), k=l),
                   'gender': random.choices(list('mf'), k=l)})

# grab the colors in the current setup (could just use a new cycle instead)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

values = df['cat'].unique()
l = len(values)

# make one subplot for every possible value
fig, axes = plt.subplots(1, l, sharey=True)

for i, value in enumerate(values):
    ax = axes[i]

    # make a dataset that includes gender and all options, then change orientation
    df2 = df[df['cat'] == value][['gender', 'op1', 'op2', 'op3', 'op4', 'op5']].groupby('gender').sum().transpose()

    # do the stacked plot. 
    # Note this has all M's one color, F's another
    # but we want each bar to have its own colour scheme
    df2.plot.bar(stacked=True, width=1, ax=ax, legend=False)

    # kludge to change bar colors
    # Note: this won't work if one gender is not present
    # or if there is a 3rd option for gender, as there is in the sample data
    # for this example, I've changed gender to just be m/f
    bars = [rect for rect in ax.get_children() if isinstance(rect, mpl.patches.Rectangle)]
    for c, b in enumerate(bars[:len(df2)*2]):
        b.set_color(colors[c%len(df2)])
        if c >= len(df2):
            b.set_alpha(0.5)

    ax.spines["top"].set_visible(False)   
    ax.spines["bottom"].set_color('grey')
    ax.spines["right"].set_visible(False)  
    ax.spines["left"].set_visible(False) 
    ax.set_xticks([])
    ax.set_xlabel(value, rotation=45)

plt.show()

[Image: what the output looks like]
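
The comments in the code above note that the color kludge breaks when a gender is missing from a category or when there is a third gender value. One possible way around that, offered only as a sketch and not as the approach used above, is to aggregate once with a two-level groupby and draw the bars by hand on a single axes with ax.bar and its bottom argument. The helper name plot_grouped_stacked and the variable totals are made up for illustration, and the fading alpha per gender is just one arbitrary way to tell the segments apart.

```python
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_grouped_stacked(df, group_col, stack_col, value_cols, ax=None):
    """Draw one bar per value column in each group, stacked by stack_col."""
    # Sum each value column per (group, stack) pair. unstack() moves the stack
    # level into the columns; fill_value=0 covers pairs that never occur.
    totals = df.groupby([group_col, stack_col])[value_cols].sum().unstack(fill_value=0)
    groups = totals.index
    stack_levels = sorted(df[stack_col].unique())

    colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    # One alpha per stack level, fading from opaque to lighter.
    alphas = np.linspace(1.0, 0.4, len(stack_levels))
    x = np.arange(len(groups))
    width = 0.8 / len(value_cols)

    if ax is None:
        _, ax = plt.subplots()
    for i, col in enumerate(value_cols):
        # Shift each value column sideways so the bars sit next to each other.
        offset = (i - (len(value_cols) - 1) / 2) * width
        bottom = np.zeros(len(groups))
        for level, alpha in zip(stack_levels, alphas):
            heights = totals[(col, level)].to_numpy()
            ax.bar(x + offset, heights, width, bottom=bottom,
                   color=colors[i % len(colors)], alpha=alpha,
                   label=f"{col} / {level}")
            bottom = bottom + heights
    ax.set_xticks(x)
    ax.set_xticklabels(groups)
    ax.set_xlabel(group_col)
    ax.legend(ncol=len(value_cols), fontsize="small")
    return totals, ax


# Sample data shaped like the question's, including the third gender value.
random.seed(0)
n = 100
ops = ["op%s" % i for i in range(1, 6)]
data = {op: [random.randint(0, 1) for _ in range(n)] for op in ops}
data["cat"] = random.choices(list("abcde"), k=n)
data["gender"] = random.choices(list("mf-"), k=n)
df = pd.DataFrame(data)

totals, ax = plot_grouped_stacked(df, "cat", "gender", ops)
# plt.show()  # uncomment when running interactively
```

Because the heights come from a table that always has one column per (op, gender) pair, a category with no rows for some gender simply gets a zero-height segment instead of shifting the colors.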
