Stacked barplot with two categorical variables from dataframe, Python
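Below are the data and the code that produce the graph you are looking for. The steps are:
1. Find the percentage of people with free/reduced lunch. This is calculated for each education group and stored in percs.
2. The totals list holds the count for each education group and is used to position the labels. Note that this count only covers people who use the free/reduced lunch option.
3. Filter the dataframe to people with free/reduced lunch, since that is what you want to plot.
4. Group the data with groupby() by education group and then by the race column, and count the rows. The result is then reshaped with unstack().
5. Plot the stacked bar graph.
6. Using totals and percs, add a label on top of each stacked bar to show the percentage. I used 1 decimal place, but you can adjust this as needed.
Note: I am using python 3.8.8 and matplotlib 3.3.4. If you have matplotlib 3.4.0 or later, you can use matplotlib's bar_label(), which may reduce the work needed to draw the text.
Hope this is what you are looking for.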
My data
>> df
parental edu. group race/ethnicity lunch
0 College edu Group A free/reduced
1 College edu Group A free/reduced
2 College edu Group A free/reduced
3 College edu Group B free/reduced
4 College edu Group B free/reduced
5 College edu Group B standard
6 College edu Group B standard
7 College edu Group A standard
8 College edu Group A standard
9 College edu Group A standard
10 High School Group A free/reduced
11 High School Group A free/reduced
12 High School Group B free/reduced
13 High School Group A standard
14 High School Group B standard
15 No edu Group B standard
16 No edu Group A standard
17 No edu Group B free/reduced
18 No edu Group A free/reduced
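To try the code on the same rows, the table above can be rebuilt as a DataFrame. The following is a minimal sketch; the `runs` list is a helper introduced here for brevity (it is not part of the original answer) and encodes each block of identical consecutive rows together with its length.

```python
import pandas as pd

# (education group, race/ethnicity, lunch, number of consecutive identical rows)
# `runs` is a hypothetical helper used only to write the 19 rows compactly.
runs = [
    ("College edu", "Group A", "free/reduced", 3),
    ("College edu", "Group B", "free/reduced", 2),
    ("College edu", "Group B", "standard", 2),
    ("College edu", "Group A", "standard", 3),
    ("High School", "Group A", "free/reduced", 2),
    ("High School", "Group B", "free/reduced", 1),
    ("High School", "Group A", "standard", 1),
    ("High School", "Group B", "standard", 1),
    ("No edu", "Group B", "standard", 1),
    ("No edu", "Group A", "standard", 1),
    ("No edu", "Group B", "free/reduced", 1),
    ("No edu", "Group A", "free/reduced", 1),
]

# Expand each run into individual rows, preserving the order shown in the table
rows = [(edu, race, lunch) for edu, race, lunch, n in runs for _ in range(n)]
df = pd.DataFrame(rows, columns=["parental edu. group", "race/ethnicity", "lunch"])
```

Printing `df` should then show the same 19 rows, indexed 0 to 18.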
Code
import pandas as pd
import matplotlib.pyplot as plt

percs = []   # percentage of free/reduced lunch within each education group
totals = []  # free/reduced count per education group (used as label height)

# Update the totals and percentages for each education group.
# sorted() keeps the order aligned with the sorted groupby() index used for the bars.
for ch in sorted(df['parental edu. group'].unique()):
    group = df[df['parental edu. group'] == ch]
    free = group[group['lunch'] == 'free/reduced']
    percs.append(round(len(free) / len(group) * 100, 1))
    totals.append(len(free))

# Group data, count the lunch column and unstack it
df = df[df['lunch'] == 'free/reduced'].groupby(['parental edu. group', 'race/ethnicity']).count().unstack('race/ethnicity')

# Drop the top level - lunch
df.columns = df.columns.droplevel()

# Plot the graph
fig, ax = plt.subplots(figsize=(8, 6))
df.plot.bar(stacked=True, rot=0, ax=ax)

# Add labels to each stacked bar.
# Note that the position is found from totals list and text is from percs list
for i, total in enumerate(totals):
    print(i, total, percs[i])  # percs already holds percentages, so no extra *100
    ax.text(i, total + 0.1, str(percs[i]) + "%", ha='center', weight='bold')

plt.show()
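For readers on matplotlib 3.4.0 or later, the bar_label() route mentioned in the note could look roughly like the sketch below. It is an alternative, not the code from the answer: the function name `plot_free_lunch` is made up for illustration, it expects the original unfiltered DataFrame, and it swaps the loop for `groupby` and `pd.crosstab`. The `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def plot_free_lunch(raw):
    """Hypothetical helper: stacked free/reduced counts with percentage labels."""
    edu, race, lunch = "parental edu. group", "race/ethnicity", "lunch"
    is_free = raw[lunch] == "free/reduced"

    # Share of free/reduced lunch within each education group, in percent
    percs = (is_free.groupby(raw[edu]).mean() * 100).round(1)

    # Free/reduced counts: one row per education group, one column per race
    counts = pd.crosstab(raw.loc[is_free, edu], raw.loc[is_free, race])

    fig, ax = plt.subplots(figsize=(8, 6))
    counts.plot.bar(stacked=True, rot=0, ax=ax)

    # Label only the topmost segment, whose upper edge is the full bar height.
    # Labels are looked up by group name, so they follow the bar order.
    labels = [f"{percs[group]:.1f}%" for group in counts.index]
    texts = ax.bar_label(ax.containers[-1], labels=labels, padding=3, weight="bold")
    return ax, percs, counts, texts
```

Calling `plot_free_lunch(df)` before `df` is overwritten by the grouped result, followed by `plt.show()` or `ax.figure.savefig(...)`, should draw the same chart. Because the labels are looked up from `counts.index`, they stay matched to the bars even when the education groups do not appear in sorted order in the data.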