How to customize pandas barplot text annotations with related column values?
I am trying to build a stacked barplot with customized text on the annotations. The barplot is built using "morning_sales" and "afternoon_sales" entries for a list of store locations, and I would like to build a custom label for each box to show the height of the box and a related value from another column (in this case, matching "morning_staff" with "morning_sales" and "afternoon_staff" with "afternoon_sales").
My method works, but relies on knowing the order of the barplot rectangles. I'm concerned that things may fall apart if I re-order the bars or do any related manipulations. Can anyone recommend a better way to do this? Note that this is a "dummy" dataframe; my true dataset is several hundred thousand rows.
I am not sure if there is a way to extract text using the "handles, labels = ax.get_legend_handles_labels()" method?
Here is the code:
import pandas as pd

data = {'location': ['Toronto', 'Vancouver', 'Edmonton', 'Calgary'],
        'morning_staff': [3, 12, 25, 6],
        'afternoon_staff': [2, 8, None, 8],
        'morning_sales': [8000, 25000, 40000, 15000],
        'afternoon_sales': [4000, 15000, None, 6000]
        }
df = pd.DataFrame(data, columns=['location', 'morning_staff', 'afternoon_staff', 'morning_sales', 'afternoon_sales'])

# Drop 'Calgary' from the plot dataset and extract the columns for plotting
df_plot = df.loc[df['location'] != 'Calgary', ['location', 'morning_sales', 'afternoon_sales']]
ax = df_plot.plot.bar(x='location', stacked=True, figsize=(8, 6), colormap='tab10', fontsize=14)

# Add an annotation to each bar -> showing staff required for sales
col_tags = ['morning_staff', 'afternoon_staff']
locations = df_plot['location'].tolist()
bar_labels = []
for col_tag in col_tags:  # morning_staff, then afternoon_staff
    for location in locations:
        idx = df.loc[df['location'] == location].index[0]
        bar_label = df.loc[idx, col_tag].item()
        bar_labels.append(bar_label)

# ax.patches is ordered column by column, then location by location
rects = ax.patches
for rect, bar_label in zip(rects, bar_labels):
    width, height = rect.get_width(), rect.get_height()
    # Skip empty bars and missing staff values (NaN != NaN, so use pd.notna)
    if height != 0 and pd.notna(bar_label):
        x, y = rect.get_xy()
        text = f'{int(bar_label)}: {int(height)}'
        ax.text(x + width / 2,
                y + height / 2,
                text,
                horizontalalignment='center',
                verticalalignment='center',
                fontsize=12)
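A note on testing for missing values when building the labels: NaN compares unequal to everything, including itself, so a comparison such as `bar_label != np.nan` is always True and filters nothing. A minimal sketch of the difference, using `pd.isna` instead (the variable names here are only illustrative):

```python
import numpy as np
import pandas as pd

value = float('nan')  # stands in for a missing staff or sales entry

# Comparing against np.nan never detects a missing value:
always_true = value != np.nan

# pd.isna / pd.notna handle NaN, None and pd.NA consistently:
nan_is_missing = pd.isna(value)
none_is_missing = pd.isna(None)
number_is_present = pd.notna(8.0)

print(always_true, nan_is_missing, none_is_missing, number_is_present)
```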
import pandas as pd
import matplotlib.pyplot as plt

data = {'location': ['Toronto', 'Vancouver', 'Edmonton', 'Calgary'],
        'morning_staff': [3, 12, 25, 6],
        'afternoon_staff': [2, 8, None, 8],
        'morning_sales': [8000, 25000, 40000, 15000],
        'afternoon_sales': [4000, 15000, None, 6000]
        }
df = pd.DataFrame.from_dict(data)
df.set_index('location', inplace=True)
df['afternoon_staff'] = df['afternoon_staff'].astype('Int64')  # nullable integer dtype
print(df)

df_plot = df.iloc[:-1, :]  # skip the last row (Calgary) by position
df_plot.iloc[:, 2:].plot(kind='bar', stacked=True, colormap='tab10')

for i in range(len(df_plot)):
    morning_sales = df_plot['morning_sales'].iloc[i]
    afternoon_sales = df_plot['afternoon_sales'].iloc[i]
    morning_label = f"{df_plot['morning_staff'].iloc[i]}:{morning_sales}"
    plt.annotate(morning_label, (i - 0.2, morning_sales / 2))
    # Edmonton has no afternoon data, so there is no bar segment to label
    if pd.notna(afternoon_sales):
        afternoon_label = f"{df_plot['afternoon_staff'].iloc[i]}:{afternoon_sales}"
        plt.annotate(afternoon_label, (i - 0.2, morning_sales + afternoon_sales / 2))
plt.tight_layout()
Output:
           morning_staff  afternoon_staff  morning_sales  afternoon_sales
location
Toronto                3                2           8000           4000.0
Vancouver             12                8          25000          15000.0
Edmonton              25             <NA>          40000              NaN
Calgary                6                8          15000           6000.0
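On the ordering concern raised in the question, one option that does not match rectangles to labels by position is to loop over `ax.containers`: each container holds the bars of one plotted column, and its label is the column name, so the related staff column can be looked up by name. The following is a minimal sketch rather than a definitive solution. It assumes matplotlib 3.4 or later (for `Axes.bar_label`), assumes every `*_sales` column has a matching `*_staff` column, and assumes the bars are drawn in the row order of `df_plot`. `labels_by_col` is a hypothetical helper dict kept only so the generated labels can be inspected, and the Agg backend line is there only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

data = {'location': ['Toronto', 'Vancouver', 'Edmonton', 'Calgary'],
        'morning_staff': [3, 12, 25, 6],
        'afternoon_staff': [2, 8, None, 8],
        'morning_sales': [8000, 25000, 40000, 15000],
        'afternoon_sales': [4000, 15000, None, 6000]}
df = pd.DataFrame(data).set_index('location')
df_plot = df.drop(index='Calgary')

sales_cols = ['morning_sales', 'afternoon_sales']
ax = df_plot[sales_cols].plot.bar(stacked=True, figsize=(8, 6), colormap='tab10')

labels_by_col = {}  # hypothetical helper, only for inspecting the labels
for container in ax.containers:
    sales_col = container.get_label()                # e.g. 'morning_sales'
    staff_col = sales_col.replace('sales', 'staff')  # assumed naming convention
    # One label per bar, in the same row order as df_plot; blank when data is missing
    labels = [
        f'{int(staff)}: {int(sales)}' if pd.notna(staff) and pd.notna(sales) else ''
        for staff, sales in zip(df_plot[staff_col], df_plot[sales_col])
    ]
    labels_by_col[sales_col] = labels
    ax.bar_label(container, labels=labels, label_type='center', fontsize=12)

plt.tight_layout()
```

Because the staff column is derived from the container's own label, re-ordering or adding sales columns should not shift the labels onto the wrong segments, as long as the naming convention holds.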