Plotting categorical data settings over time in Python

Question

I'm having trouble generating a plot of various settings over time using matplotlib. I would like it to look like a stacked horizontal bar chart, even though the data is categorical.

import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Setting1':['A','A','C'],'Setting2':['B','B','B'],'Setting3':
['D','D','C'],'TimeStr':['2021-06-12 13:00:00','2021-06-12 13:00:01','2021-06-12 13:00:02']})
df['TimeStr'] = pd.to_datetime(df['TimeStr'])
fig,ax = plt.subplots()
plt.barh(df['Setting1'],df['TimeStr'])
plt.barh(df['Setting2'],df['TimeStr'])
plt.barh(df['Setting3'],df['TimeStr'])
plt.show()

The desired output would look something like this:

         |-------------------------
Setting3 |         D       |  C   |
         |-------------------------
         |-------------------------
Setting2 |           B            |
         |-------------------------
         |-------------------------
Setting1 |       A         | C    |
         |-------------------------
         |____________________________
                     Time

Currently my y axis is labelled with the values A, B, C and D rather than with the setting names (Setting1, Setting2, Setting3). Is there a way to achieve this using matplotlib?

Answer 1

There are lots of ways to do this; here is one implementation. Your dataframe isn't laid out in a way that suits bar plots. Bar plots usually expect one column per set of bars, which makes it easy to stack the bars by passing the left position where each one should start. Depending on how your data are structured, you may need a groupby to get the counts for your categorical variables. Here is a really nice groupby tutorial to help with that. If you need more precise time measurements, you could use pd.Timedelta values in the dataframe instead of counts. Here is a nice tutorial on stacked bar plots with matplotlib.
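The groupby step mentioned above could look roughly like the sketch below. It assumes the dataframe from the question, with one row per timestamp, and counts how many samples each setting spent at each value. The names long_df and counts are only illustrative.

```python
import pandas as pd

# Dataframe from the question: one row per timestamp, one column per setting
df = pd.DataFrame({
    'Setting1': ['A', 'A', 'C'],
    'Setting2': ['B', 'B', 'B'],
    'Setting3': ['D', 'D', 'C'],
    'TimeStr': ['2021-06-12 13:00:00', '2021-06-12 13:00:01', '2021-06-12 13:00:02'],
})
df['TimeStr'] = pd.to_datetime(df['TimeStr'])

# Reshape to long format: one row per (timestamp, setting, value)
long_df = df.melt(id_vars='TimeStr', var_name='Setting', value_name='Value')

# Count samples per setting and value, then pivot the values into columns
counts = long_df.groupby(['Setting', 'Value']).size().unstack(fill_value=0)
print(counts)
```

The result has one row per setting and one column per value (A, B, C, D), which is the "one column per set of bars" layout. Note that counting discards the order in which the values occurred.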

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib as mpl
sns.set(style='white') #use seaborn's white style

#make dataframe
df = pd.DataFrame({
    'Setting' : ['S1', 'S2', 'S3'], 
    'A' : [0, 0, 1],
    'B' : [0, 3, 0],
    'C' : [1, 0, 1],
    'D' : [2, 0, 0]
    })

fig, ax = plt.subplots(figsize=(12,5))
bars = ['A', 'B', 'C', 'D'] #order we will plot the bars
left = 0                    #will indicate left starting points for next set of bars

#for each name in bars (A, B, C, D), i.e. for each column,
#draw horizontal bars that start at the LEFT variable,
#then advance LEFT by the widths just drawn
for bar in bars:
    ax.barh(
        y = np.arange(len(df['A'])) / 1.2, #divide by 1.2 to scale down the y axis
        width = df[bar].values,
        left = left,
        label = bar,
        height = 0.5
    )
    left += df[bar]

#Add y_axis tick labels, FixedLocator, FixedFormatter, legend, x_label
#hide spines
labels = ['Setting 1', 'Setting 2', 'Setting 3']
ax.yaxis.set_major_locator(mpl.ticker.FixedLocator(np.arange(len(df['A'])) / 1.2))
ax.yaxis.set_major_formatter(mpl.ticker.FixedFormatter(labels))
legend = ax.legend(edgecolor='w', fontsize=14, ncol=4, bbox_to_anchor=(0.25, 1), loc='lower left')
spines = [ax.spines[x].set_visible(False) for x in ['top','right','bottom']]
x_label = ax.set_xlabel('Time for each setting')
ylim = ax.set_ylim(-0.3, 1.95)
plt.show()
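If the order of the values along the time axis matters, as in the sketch in the question, one alternative is to collapse consecutive identical values into runs and draw each run at its real start time. The following is a minimal sketch of that idea, not part of the answer above. The helpers value_runs and plot_runs are hypothetical names, and the code assumes at least two rows and that the last sample lasts as long as the preceding sampling interval.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Setting1': ['A', 'A', 'C'],
    'Setting2': ['B', 'B', 'B'],
    'Setting3': ['D', 'D', 'C'],
    'TimeStr': ['2021-06-12 13:00:00', '2021-06-12 13:00:01', '2021-06-12 13:00:02'],
})
df['TimeStr'] = pd.to_datetime(df['TimeStr'])


def value_runs(df, setting_cols, time_col='TimeStr'):
    """Collapse consecutive identical values into (Setting, Value, Start, End) rows."""
    df = df.sort_values(time_col)
    times = df[time_col]
    # Assumption: the last sample lasts as long as the preceding interval
    last_end = times.iloc[-1] + times.diff().iloc[-1]
    ends = times.shift(-1).fillna(last_end)
    pieces = []
    for col in setting_cols:
        # A new run starts whenever the value differs from the previous row
        run_id = (df[col] != df[col].shift()).cumsum()
        samples = pd.DataFrame({'Value': df[col], 'Start': times, 'End': ends})
        runs = samples.groupby(run_id).agg(
            Value=('Value', 'first'), Start=('Start', 'min'), End=('End', 'max'))
        runs.insert(0, 'Setting', col)
        pieces.append(runs)
    return pd.concat(pieces, ignore_index=True)


def plot_runs(runs):
    """Draw one horizontal bar per run, one row per setting."""
    settings = list(runs['Setting'].unique())
    values = sorted(runs['Value'].unique())
    colors = {v: f'C{i % 10}' for i, v in enumerate(values)}
    # Work in matplotlib date numbers (floats) so widths are plain differences
    start = mdates.date2num(runs['Start'])
    end = mdates.date2num(runs['End'])
    y = [settings.index(s) for s in runs['Setting']]

    fig, ax = plt.subplots(figsize=(12, 5))
    ax.barh(y=y, width=end - start, left=start, height=0.5,
            color=[colors[v] for v in runs['Value']], edgecolor='white')
    # Label each bar with its categorical value
    for x0, x1, yi, v in zip(start, end, y, runs['Value']):
        ax.text((x0 + x1) / 2, yi, v, ha='center', va='center')
    ax.set_yticks(range(len(settings)))
    ax.set_yticklabels(settings)
    ax.xaxis_date()  # interpret the x values as dates
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))
    ax.set_xlabel('Time')
    return fig, ax


runs = value_runs(df, ['Setting1', 'Setting2', 'Setting3'])
fig, ax = plot_runs(runs)
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. With the question's data this gives five bars: A then C for Setting1, a single B for Setting2, and D then C for Setting3, with the setting names on the y axis.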

solution1
0 2021-07-12 22:29:29