Pandas and matplotlib stacked bar chart with major and minor x-ticks grouped together

Question

I have the following data:

id, approach, outcome
a1, approach1, outcome1
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach1, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach2, outcome1
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach3, outcome2
a2, approach3, outcome2
a2, approach3, outcome1
a2, approach3, outcome2
a2, approach3, outcome1

I found a chart from another user (image not shown) that is exactly what I am looking to accomplish, but instead of fruits we have ids and instead of years we have approaches.

Here is what I have done so far:

import pandas
import matplotlib.pyplot as plt

df = pandas.read_csv("test.txt", sep=r',\s+', engine="python")
fig, ax = plt.subplots(1, 1, figsize=(5.5, 4))

data = df[df.approach == "approach1"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=0.6, color=["g", "r"], stacked=True, ax=ax)

data = df[df.approach == "approach2"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=-0.6, color=["g", "r"], stacked=True, ax=ax)

# "Activate" minor ticks
ax.minorticks_on()

rects_locs = []
p = 0
for patch in ax.patches:
    rects_locs.append(patch.get_x() + patch.get_width())
    # p += 0.01

# Set minor ticks there
ax.set_xticks(rects_locs, minor = True)

# Labels for the rectangles
new_ticks = ["Approach1"] * 10 + ["Approach2"] * 10

# Set the labels
from matplotlib import ticker
ax.xaxis.set_minor_formatter(ticker.FixedFormatter(new_ticks))  #add the custom ticks

# Move the category label further from x-axis
ax.tick_params(axis='x', which='major', pad=15)

# Remove minor ticks where not necessary
ax.tick_params(axis='x', which='both', top=False)
ax.tick_params(axis='y', which='both', left=False, right=False)
plt.xticks(rotation=0)

But the output is not nice (image not shown).

Basically, I want id as the major x-tick (so there should be 2 such x values), and for each id there should be 3 grouped stacked bars (approach1, approach2, approach3).

Answer 1

Well, I'm not proud of it. But it works. Hopefully somebody more knowledgeable will come along with a better solution.

I start by setting up your data:

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

data = np.array([
'id', 'approach', 'outcome',
'a1', 'approach1', 'outcome1',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1'])

data = data.reshape(data.size // 3, 3)

df = pd.DataFrame(data[1:], columns=data[0])
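
As an aside, the same kind of frame can probably be built without the numpy reshape by letting pandas parse the text directly. The following is only a sketch: it uses a four-row excerpt of the question's data, and the names `text` and `df_excerpt` are made up for this illustration.

```python
import io

import pandas as pd

# Excerpt of the question's data (header plus four rows), for illustration only.
text = """id, approach, outcome
a1, approach1, outcome1
a1, approach1, outcome2
a1, approach2, outcome1
a2, approach1, outcome2
"""

# skipinitialspace=True drops the blank after each comma,
# so no regex separator is needed.
df_excerpt = pd.read_csv(io.StringIO(text), skipinitialspace=True)
print(df_excerpt)
```

With the full data pasted into `text`, the result should have the same three columns as the `df` built above.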

Next I count up all the occurrences of "outcome1" and "outcome2" for each approach and id. (I'm sure this could be done directly in pandas, but I'm a bit of a pandas novice):

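# Nested dict of counts: counts[id][approach][outcome]
# (named "counts" and "id_" to avoid shadowing the built-ins dict and id)
counts = {}

for id_ in 'a1', 'a2':
    counts[id_] = {}
    for approach in 'approach1', 'approach2', 'approach3':
        counts[id_][approach] = {}
        for outcome in 'outcome1', 'outcome2':
            # Summing a boolean mask counts the rows where all three match
            counts[id_][approach][outcome] = ((df['id'] == id_)
                                              & (df['approach'] == approach)
                                              & (df['outcome'] == outcome)).sum()

# Columns are ids, rows are approaches; each cell holds that pair's outcome dict
plot_data = pd.DataFrame(counts)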
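
For what it's worth, the counting can be done directly in pandas. Below is a minimal sketch using `pd.crosstab`, with an equivalent `groupby` spelling next to it. To keep it self-contained, the question's data is rebuilt from a compact form in which each digit stands for one outcome; the names `compact`, `table` and `table_gb` are invented for this example.

```python
import pandas as pd

# The question's data, written compactly: each digit is one row's outcome (1 or 2).
compact = [
    ("a1", "approach1", "12222"), ("a1", "approach2", "11111"),
    ("a1", "approach3", "11111"), ("a2", "approach1", "21121"),
    ("a2", "approach2", "11212"), ("a2", "approach3", "22121"),
]
df = pd.DataFrame(
    [(id_, approach, f"outcome{d}") for id_, approach, digits in compact for d in digits],
    columns=["id", "approach", "outcome"],
)

# One row per (id, approach) pair, one column per outcome, cells hold counts.
table = pd.crosstab([df["id"], df["approach"]], df["outcome"])

# Equivalent groupby spelling; fill_value=0 covers combinations that never occur
# (for example a1/approach2 has no outcome2 rows).
table_gb = df.groupby(["id", "approach", "outcome"]).size().unstack(fill_value=0)

print(table)
```

Unlike `plot_data`, this table has an (id, approach) MultiIndex on the rows and plain integer cells, so a single count is read with something like `table.loc[("a1", "approach1"), "outcome2"]`.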

Now all that is left is to do the plotting.

fig, ax = plt.subplots(1, 1)

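i = 0  # x position of the next bar
for id_ in 'a1', 'a2':
    for approach in 'approach1', 'approach2', 'approach3':
        # Stack outcome2 (red) on top of outcome1 (green)
        ax.bar(i, plot_data[id_][approach]["outcome1"], color='g')
        ax.bar(i, plot_data[id_][approach]["outcome2"],
               bottom=plot_data[id_][approach]["outcome1"], color='r')
        i += 1
    i += 1  # leave an empty slot between the two ids

# Put one tick under each bar (x = 3 is the gap between the ids) before labelling,
# so the labels do not depend on where matplotlib places its default ticks
ax.set_xticks([0, 1, 2, 4, 5, 6])
ax.set_xticklabels(['approach1', 'approach2', 'approach3'] * 2, rotation=45)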

custom_lines = [Line2D([0], [0], color='g', lw=4),
                Line2D([0], [0], color='r', lw=4)]

ax.legend(custom_lines, ['Outcome 1', 'Outcome 2'])

plt.show()
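
The plot above labels the approaches but not the ids. One possible way to get both, sketched here under some assumptions, is to use minor ticks for the individual bars and major ticks for the id groups. The data is rebuilt compactly so the block runs on its own; `compact`, `table`, `bar_pos` and `group_pos` are names made up for this sketch, and `pad=55` is a hand-tuned guess that may need adjusting for other fonts or figure sizes. The legend is left out for brevity.

```python
import matplotlib.pyplot as plt
import pandas as pd

# The question's data, written compactly: each digit is one row's outcome (1 or 2).
compact = [
    ("a1", "approach1", "12222"), ("a1", "approach2", "11111"),
    ("a1", "approach3", "11111"), ("a2", "approach1", "21121"),
    ("a2", "approach2", "11212"), ("a2", "approach3", "22121"),
]
df = pd.DataFrame(
    [(id_, approach, f"outcome{d}") for id_, approach, digits in compact for d in digits],
    columns=["id", "approach", "outcome"],
)
table = pd.crosstab([df["id"], df["approach"]], df["outcome"])

fig, ax = plt.subplots(figsize=(5.5, 4))

bar_pos, bar_labels, group_pos, group_labels = [], [], [], []
x = 0
for id_, block in table.groupby(level="id", sort=False):
    start = x
    for (_, approach), row in block.iterrows():
        # Stack outcome2 (red) on top of outcome1 (green).
        ax.bar(x, row["outcome1"], color="g")
        ax.bar(x, row["outcome2"], bottom=row["outcome1"], color="r")
        bar_pos.append(x)
        bar_labels.append(approach)
        x += 1
    group_pos.append((start + x - 1) / 2)  # centre of this id's bars
    group_labels.append(id_)
    x += 1  # empty slot between ids

# A group's centre coincides with its middle bar, and matplotlib hides minor
# ticks that overlap major ones by default, so switch that trimming off.
ax.xaxis.remove_overlapping_locs = False

# Minor ticks label the individual bars, major ticks label the id groups.
ax.set_xticks(bar_pos, minor=True)
ax.set_xticklabels(bar_labels, minor=True, rotation=45)
ax.set_xticks(group_pos)
ax.set_xticklabels(group_labels)

# Push the id labels below the rotated approach labels and hide their tick marks.
ax.tick_params(axis="x", which="major", pad=55, length=0)
fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure. Iterating over the counted table rather than hard-coding the id and approach names should also cope with a different number of ids, as long as both outcome columns are present.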

solution1
1 ACCEPTED 2019-02-13 15:45:02