简体   繁体   中英

Seaborn how to add number of samples per HUE in sns.catplot

I have a catplot drawing using:

s = sns.catplot(x="type", y="val", hue="Condition", kind='box', data=df)

在此处输入图像描述

However, the size of "Condition" per hue is not equal: The blue has n=8 samples, and The green has n=11 samples.

What is the best way to add this info to the graph?

This is essentially the same solution as an earlier answer of mine , which I simplified a bit since:

df = sns.load_dataset('tips')
x_col='day'
y_col='total_bill'
order=['Thur','Fri','Sat','Sun']
hue_col='smoker'
hue_order=['Yes','No']
width=0.8


g = sns.catplot(kind="box", x=x_col, y=y_col, order=order, hue=hue_col, hue_order=hue_order, data=df)
ax = g.axes[0,0]

# get the offsets used by boxplot when hue-nesting is used
# https://github.com/mwaskom/seaborn/blob/c73055b2a9d9830c6fbbace07127c370389d04dd/seaborn/categorical.py#L367
n_levels = len(df[hue_col].unique())
each_width = width / n_levels
offsets = np.linspace(0, width - each_width, n_levels)
offsets -= offsets.mean()

pos = [x+o for x in np.arange(len(order)) for o in offsets]

counts = df.groupby([x_col,hue_col])[y_col].size()
counts = counts.reindex(pd.MultiIndex.from_product([order,hue_order]))
medians = df.groupby([x_col,hue_col])[y_col].median()
medians = medians.reindex(pd.MultiIndex.from_product([order,hue_order]))

for p,n,m in zip(pos,counts,medians):
    if not np.isnan(m):
        ax.annotate('N={:.0f}'.format(n), xy=(p, m), xycoords='data', ha='center', va='bottom')

在此处输入图像描述

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM