
Show count and percentage labels for a grouped bar chart in Python


I would like to add count and percentage labels to a grouped bar chart, but I haven't been able to figure it out.
I've seen examples for count or percentage for single bars, but not for grouped bars.

The data looks something like this (not the real numbers):

  age_group  Mis  surv  unk  death  total  surv_pct  death_pct
0       0-9    1     2    0      3      6     100.0        0.0
1     10-19    2     1    0      1      4      99.9        0.0
2     20-29    0     3    0      1      4      99.9        0.0
3     30-39    0     7    1      2     10     100.0        0.0
4     40-49    0     5    0      1      6      99.7        0.3
5     50-59    0     6    0      4     10      99.3        0.3
6     60-69    0     7    1      4     12      98.0        2.0
7     70-79    1     8    2      5     16      92.0        8.0
8       80+    0    10    0      7     17      81.0       19.0

And the chart looks something like this: [image: grouped bar chart]

I created the chart with this code:

ax = df.plot(y=['deaths', 'surv'],
             kind='barh',
             figsize=(20,9),
             rot=0,
             title= '\n\n surv and deaths by age group')
ax.legend(['Deaths', 'Survivals'])
ax.set_xlabel('\nCount')
ax.set_ylabel('Age Group\n')


How could I add count and percentage labels to the grouped bars? I would like it to look something like this chart: [image: grouped bar chart with count and percentage labels]

Since nobody else has suggested anything, here is one way to approach it with your dataframe structure.

from matplotlib import pyplot as plt
import pandas as pd

# sep=r"\s+" splits on whitespace; delim_whitespace=True is deprecated since pandas 2.2
df = pd.read_csv("test.txt", sep=r"\s+")

cat = ['death', 'surv']

ax = df.plot(y=cat,
             kind='barh',
             figsize=(20, 9),
             rot=0,
             title= '\n\n surv and deaths by age group')

#making space for the annotation
xmin, xmax = ax.get_xlim()
ax.set_xlim(xmin, 1.05 * xmax)

#connecting bar series with df columns
for cont, col in zip(ax.containers, cat):
    #connecting each bar of the series with its absolute and relative values 
    for rect, vals, perc in zip(cont.patches, df[col], df[col+"_pct"]):
        #annotating each bar
        ax.annotate(f"{vals} ({perc:.1f}%)", (rect.get_width(), rect.get_y() + rect.get_height() / 2.),
                     ha='left', va='center', fontsize=10, color='black', xytext=(3, 0),
                     textcoords='offset points')

ax.set_yticklabels(df.age_group)
ax.set_xlabel('\nCount')
ax.set_ylabel('Age Group\n')
ax.legend(['Deaths', 'Survivals'], loc="lower right")
plt.show()

Sample output: [image: the grouped bar chart with a "count (percentage%)" label next to each bar]
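On matplotlib 3.4 or newer, the manual annotation loop can probably be replaced by `Axes.bar_label`. Here is a minimal sketch of that variant; it assumes the same column naming as above (`death`/`death_pct`, `surv`/`surv_pct`) and uses a small inline DataFrame, built from the first three rows of the sample data, as a stand-in for test.txt.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Stand-in for test.txt: first three rows of the question's sample data.
df = pd.DataFrame({
    "age_group": ["0-9", "10-19", "20-29"],
    "surv": [2, 1, 3],
    "death": [3, 1, 1],
    "surv_pct": [100.0, 99.9, 99.9],
    "death_pct": [0.0, 0.0, 0.0],
})

cat = ["death", "surv"]

# x="age_group" lets pandas set the tick labels of the category axis.
ax = df.plot(x="age_group", y=cat, kind="barh", figsize=(10, 5))

# One bar container per plotted column, in the order given in `cat`.
for cont, col in zip(ax.containers, cat):
    labels = [f"{v} ({p:.1f}%)" for v, p in zip(df[col], df[col + "_pct"])]
    # padding is the gap between bar end and label, in points.
    ax.bar_label(cont, labels=labels, padding=3, fontsize=10)

# Extra room on the value axis so the longest label is not clipped.
ax.margins(x=0.1)
ax.set_xlabel("Count")
ax.set_ylabel("Age Group")
ax.legend(["Deaths", "Survivals"], loc="lower right")
```

Call `plt.show()` (or `ax.figure.savefig(...)`) afterwards to see the result. The padding and margin values are guesses and may still need tuning for the real data.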

If the percentages per category add up, you could also calculate them on the fly, which would remove the requirement that each percentage column is named after its count column plus "_pct". Another limitation is that the annotation font size, the axis scaling that makes room for the label of the longest bar (the factor 1.05), and the offset between bar and annotation are hard-coded; they do not adapt to the figure and may need fine-tuning.
However, I am not fond of this mixing of pandas and matplotlib plotting functions. I had cases where the axis definition by pandas interfered with matplotlib, and datetime objects... well, let's not talk about that.
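To illustrate both remarks, here is a rough sketch that computes the percentages on the fly and plots with matplotlib only. The counts are copied from the first four rows of the sample data. The percentage definition is an assumption on my part: each bar's share of deaths plus survivals within its age group, which may not be how the original `_pct` columns were derived.

```python
import numpy as np
from matplotlib import pyplot as plt

# Counts from the first four rows of the question's sample data.
age_groups = ["0-9", "10-19", "20-29", "30-39"]
counts = {"Deaths": [3, 1, 1, 2], "Survivals": [2, 1, 3, 7]}

# Assumption: percentage = share of (deaths + survivals) per age group,
# so the two labels within one group add up to 100%.
totals = np.sum(list(counts.values()), axis=0)

fig, ax = plt.subplots(figsize=(10, 5))
y = np.arange(len(age_groups))
bar_height = 0.4

for i, (name, vals) in enumerate(counts.items()):
    vals = np.asarray(vals)
    pct = 100 * vals / totals
    # Shift each series so the bars of one age group sit side by side.
    bars = ax.barh(y + i * bar_height, vals, height=bar_height, label=name)
    for rect, v, p in zip(bars, vals, pct):
        # Label at the bar end, vertically centred, 3 points to the right.
        ax.annotate(f"{v} ({p:.1f}%)",
                    (rect.get_width(), rect.get_y() + rect.get_height() / 2),
                    xytext=(3, 0), textcoords="offset points",
                    ha="left", va="center", fontsize=10)

# Put the tick for each age group between its two bars.
ax.set_yticks(y + bar_height / 2)
ax.set_yticklabels(age_groups)
ax.margins(x=0.1)  # room for the label of the longest bar
ax.set_xlabel("Count")
ax.set_ylabel("Age Group")
ax.legend(loc="lower right")
```

An age group with a total of zero would need special handling before the division. As before, finish with `plt.show()` or `fig.savefig(...)`.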

