
Grouped Bar-Chart with customized DateTime Index using pandas and Matplotlib

I'd like to create a grouped bar chart with a customized DatetimeIndex that shows just the month and year instead of the full dates. I want the bars to be grouped, not stacked.

I assumed pandas could handle this easily, using:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)
ax = testdata.plot.bar()

This creates the plot that I want; I'd just like to change the dates into something simpler, like March 2019, April 2019, May 2019.

[Image: grouped bar chart, but the x-axis labels are poor]

I assumed using a Custom Date Formatter would work, therefore I tried

ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))

But then my labels are gone completely. And this question implies that pandas and the DateFormatter have a bit of a difficult relationship. Therefore I tried to do it with Matplotlib basics:

fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

Now the date representation is as expected (although the whitespace could be reduced), but the bars overlap, which doesn't help.

Defining a width and subtracting it from the x values (as normally suggested) won't help because of the DatetimeIndex I use. I get an error that subtraction between a DatetimeIndex and a float is unsupported.

fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index-width, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index+width, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

So now I'm running out of ideas and hope for input.

Pandas barplots are categorical. So maybe you're overthinking this and just want to use the string you want to see as a category label on the axis as index?

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

df.index = [d.strftime("%b %Y") for d in df.index]
ax = df.plot.bar()
plt.show()


The reason ax.xaxis.set_major_locator(mdates.MonthLocator()) fails is that, under the hood, pandas plots the bars against range(len(df)) and then renames the ticks accordingly.
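To see this for yourself, you can inspect the tick positions that pandas sets on the axes. This is only a small sketch: the Agg backend is an assumption so that it runs without a display, and the DataFrame is rebuilt from the question's data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

ax = testdata.plot.bar()

# Tick positions are plain integers (one per row), not date values
tick_positions = list(ax.get_xticks())
# The dates only survive as the text of the tick labels
tick_labels = [label.get_text() for label in ax.get_xticklabels()]

print(tick_positions)
print(tick_labels)
```

A date locator such as MonthLocator looks for month boundaries in date units, so on an axis that only spans the positions 0 to 2 it has nothing sensible to place.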

You can grab the xticklabels after you plot, and reformat them:

ax = testdata.plot.bar()

ticks = [tick.get_text() for tick in ax.get_xticklabels()]
ticks = pd.to_datetime(ticks).strftime('%b %Y')
ax.set_xticklabels(ticks)

which gives the same result as ImportanceOfBeingErnest's answer.

Another, probably better way is to shift the bars of each column. This works better when you have many columns and want to reduce the number of xticks.

fig, ax = plt.subplots()

# define the shift
shift = pd.to_timedelta('1D')

# shift the x positions of each column; this could also be done in a for loop
ax.bar(testdata.index + shift, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index - shift, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()
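The for loop mentioned in the comment could look roughly like the sketch below. It is a variation on the approach above rather than the same code: the dates are converted to floats (days) with mdates.date2num so that ordinary arithmetic works, and each bar gets an explicit width. The names group_width and bar_width and the value of 20 days are assumptions chosen for illustration, as is the Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

n_cols = len(testdata.columns)
group_width = 20.0                  # assumed total width of one group, in days
bar_width = group_width / n_cols    # width of a single bar, in days

# Dates as floats (days), so adding a float offset is allowed
x = mdates.date2num(testdata.index.to_pydatetime())

fig, ax = plt.subplots()
for i, col in enumerate(testdata.columns):
    # Centre each group on its date: offsets are -1, 0, +1 bar widths for three columns
    offset = (i - (n_cols - 1) / 2) * bar_width
    ax.bar(x + offset, testdata[col], width=bar_width, label=col)

# Tell the axis that the float positions are dates, then format them
ax.xaxis_date()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
ax.legend()
fig.canvas.draw()
```

Note that MonthLocator places ticks on the first day of each month, so the tick for March sits slightly to the left of the group drawn at 2019-03-02.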

