I want to add the number of observations to Seaborn barplots. I created a barplot with four bars that represent percentages on the y axis. I want to add a label on each bar showing the number of observations.
In my code, the first block creates the barplot.
I adapted the second and third blocks from examples that I found elsewhere. I get an error pointing to the line beginning with "medians": AttributeError: 'float' object has no attribute 'values'
sns.set_style("whitegrid")
ax = sns.barplot(x=barplot_x, y="trump_margin_pct",
data=mean_analysis)
sns.palplot(sns.diverging_palette(240, 0))
ax.set(xlabel='Strength of Candidate Support', ylabel='Average Trump Margin of Victory/(Loss) (in %)')
ax.set_title('Average Strength of Candidate Support Across Groups of Counties, 2016')
# Calculate number of obs per group & median to position labels
medians = mean_analysis['trump_margin_pct'].median().values
nobs = mean_analysis['trump_margin_pct'].value_counts().values
nobs = [str(x) for x in nobs.tolist()]
nobs = ["n: " + i for i in nobs]
# Add it to the plot
pos = range(len(nobs))
for tick, label in zip(pos, ax.get_xticklabels()):
    ax.text(pos[tick], medians[tick] + 0.03, nobs[tick],
            horizontalalignment='center', size='x-small', color='w',
            weight='semibold')
Your approach is almost right. However, you calculate the median and the number of observations over the whole column mean_analysis['trump_margin_pct'] and not over groups. Calling .median() on the whole column returns a single number, which has no .values attribute, and this causes your error. You can use groupby to calculate over groups.
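To see the difference, here is a minimal sketch on made-up data (the column names "group" and "value" are placeholders, not your real columns). The ungrouped median is one scalar, while the grouped median is a Series with one entry per group, so .values works on it.

```python
import pandas as pd

# Hypothetical toy data standing in for mean_analysis
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "value": [1.0, 3.0, 10.0, 20.0, 60.0],
})

# Median over the whole column: a single scalar, no .values attribute
overall = df["value"].median()
print(hasattr(overall, "values"))

# Median per group: a Series indexed by group, .values gives an array
per_group = df.groupby("group")["value"].median()
print(per_group.values)
```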
Median:
Simply add groupby to calculate your median per group. Group by the same barplot_x that you pass as x to sns.barplot:
medians = mean_analysis.groupby(barplot_x)['trump_margin_pct'].median().values
Number of obs:
For the number of observations, count the values within each group instead of calling value_counts on the whole column. This is how you can do this.
nobs = mean_analysis.groupby(barplot_x)['trump_margin_pct'].agg(['count'])
nobs = ["n: " + str(i) for s in nobs.values for i in s]
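If you prefer, both statistics can probably be computed in one pass. This is only a sketch on made-up data (the "group" and "value" columns are placeholders); it passes a list of aggregation names to agg, which returns one column per statistic.

```python
import pandas as pd

# Hypothetical toy data standing in for mean_analysis
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "value": [1.0, 3.0, 10.0, 20.0, 60.0],
})

# One groupby, two statistics: columns are named "median" and "count"
stats = df.groupby("group")["value"].agg(["median", "count"])

medians = stats["median"].values                  # label heights
nobs = ["n: " + str(n) for n in stats["count"]]   # label texts
print(nobs)
```

Because both arrays come from the same grouped table, the medians and the counts are guaranteed to be in the same group order.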
Example:
I used some dummy data to recreate your example.
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style("whitegrid")
tips = sns.load_dataset("tips")

# Barplot of the mean total_bill per day
ax = sns.barplot(x="day", y="total_bill", data=tips)
ax.set(xlabel='Strength of Candidate Support', ylabel='Average Trump Margin of Victory/(Loss) (in %)')
ax.set_title('Average Strength of Candidate Support Across Groups of Counties, 2016')

# Median per group (label height) and number of observations per group (label text)
medians = tips.groupby(['day'])['total_bill'].median().values
nobs = tips.groupby(['day'])['total_bill'].agg(['count'])
nobs = ["n: " + str(i) for s in nobs.values for i in s]

# Write one label per bar, slightly above the group median
pos = range(len(nobs))
for tick, label in zip(pos, ax.get_xticklabels()):
    ax.text(pos[tick], medians[tick] + 0.03, nobs[tick], horizontalalignment='center', size='x-small', color='w', weight='semibold')

plt.show()
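One thing worth checking with your own data: the loop assumes the groups come out of groupby in the same order as the bars. groupby sorts the group keys by default, while seaborn draws plain string categories in order of appearance, so the two can disagree. A minimal sketch of looking the counts up by tick label instead (labels_in_tick_order is a hypothetical helper, and the tick_labels list stands in for [t.get_text() for t in ax.get_xticklabels()]):

```python
import pandas as pd

def labels_in_tick_order(counts, tick_labels):
    """Return 'n: <count>' strings ordered like the given tick labels."""
    return ["n: " + str(counts[name]) for name in tick_labels]

# Hypothetical toy data where "b" appears before "a"
df = pd.DataFrame({
    "group": ["b", "b", "a", "b", "a"],
    "value": [10.0, 20.0, 1.0, 60.0, 3.0],
})

counts = df.groupby("group")["value"].count()   # index is sorted: a, b
tick_labels = ["b", "a"]                        # assumed bar order on the plot

labels = labels_in_tick_order(counts, tick_labels)
print(labels)
```

With the tips data this does not matter, because day is a categorical column and both libraries follow the category order.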