
How to add data labels to seaborn pointplot?

The code below creates a categorical plot with a pointplot on top of it, where the pointplot shows the mean and 95% confidence interval for each category. I need to add the mean data label to the plot, and I can't figure out how to do it.

FYI each category has thousands of points, so I don't want to label every datapoint, just the estimator=np.mean values in the point plot. Is this possible?

I've created a sample dataset here so you can copy and paste the code and run it yourself.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np

d = {'SurfaceVersion': ['v1', 'v1', 'v1', 'v2', 'v2', 'v2', 'v3', 'v3', 'v3'],
     'Error%': [.01, .03, .15, .28, .39, .01, .01, .06, .09]}

df_comb = pd.DataFrame(data=d)

plotHeight = 10
plotAspect = 2
 
#create catplot with jitter per surface version (catplot returns a FacetGrid, not an Axes):
g = sns.catplot(data=df_comb, x='SurfaceVersion', y='Error%', jitter=True, legend=False, zorder=1, height=plotHeight, aspect=plotAspect)
#overlay the mean and 95% CI on the same axes; height and aspect are figure-level options, so they are only passed to catplot:
ax = sns.pointplot(data=df_comb, x='SurfaceVersion', y='Error%', estimator=np.mean, ci=95, capsize=.1, errwidth=1, hue='SurfaceVersion', color='k', zorder=2, join=False)
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1.0))
plt.gca().legend().set_title('')
plt.grid(color='grey', which='major', axis='y', linestyle='--')
plt.xlabel('Surface Version')
plt.ylabel('Error %')
plt.subplots_adjust(top=0.95, left=.05)
plt.suptitle('Error%')
plt.legend([],[], frameon=False)                #This is to get rid of the legend that pops up with the seaborn plot b/c it's buggy.
plt.axhline(y=0, color='r', linestyle='--')
plt.show()

You can pre-calculate the mean and add the labels in a loop. Bear in mind that x-values are really just 0, 1, 2 as far as positioning is concerned.

mean_df = df_comb.groupby("SurfaceVersion")[["Error%"]].mean()

for i, row in enumerate(mean_df.itertuples()):
    x_value, mean = row
    plt.annotate(
        round(mean, 2),               # label text
        (i, mean),                    # (x, y)
        textcoords="offset points",
        xytext=(10, 0),               # (x, y) offset amount
        ha='left')
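
One thing to watch with the loop above: groupby sorts the categories, while seaborn (as far as I know) draws string categories in order of first appearance unless you pass order=. With v1, v2, v3 the two orders happen to agree, but with other data the labels could land on the wrong points. Below is a rough sketch of the same idea wrapped in a function that takes the order explicitly. The name annotate_group_means and its fmt/offset parameters are made up for this example and are not part of seaborn or matplotlib. To keep the sketch free of a seaborn dependency, plain matplotlib markers stand in for the pointplot; in the real script you would call it on the Axes returned by sns.pointplot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def annotate_group_means(ax, df, x, y, order=None, fmt="{:.1%}", offset=(10, 0)):
    """Label the mean of `y` for each category of `x` on `ax`.

    Hypothetical helper for illustration only, not a seaborn or matplotlib API.
    """
    means = df.groupby(x)[y].mean()
    if order is None:
        # Assumption: the plot draws categories in order of first appearance.
        order = list(pd.unique(df[x]))
    annotations = []
    for pos, category in enumerate(order):
        mean = means.loc[category]
        ann = ax.annotate(
            fmt.format(mean),            # label text, e.g. a percentage
            (pos, mean),                 # categories sit at x = 0, 1, 2, ...
            textcoords="offset points",
            xytext=offset,               # shift the label away from the marker
            ha="left",
            va="center",
        )
        annotations.append(ann)
    return annotations


d = {'SurfaceVersion': ['v1', 'v1', 'v1', 'v2', 'v2', 'v2', 'v3', 'v3', 'v3'],
     'Error%': [.01, .03, .15, .28, .39, .01, .01, .06, .09]}
df_comb = pd.DataFrame(data=d)

fig, ax = plt.subplots()
# Stand-in for the seaborn pointplot: one black marker per category mean.
order = list(pd.unique(df_comb['SurfaceVersion']))
group_means = df_comb.groupby('SurfaceVersion')['Error%'].mean().loc[order]
ax.plot(range(len(order)), group_means.to_numpy(), 'ko')

labels = annotate_group_means(ax, df_comb, 'SurfaceVersion', 'Error%')
print([a.get_text() for a in labels])
```

The default fmt prints percentages so the labels match the PercentFormatter on the y-axis; pass fmt="{:.2f}" if you would rather keep the raw fractions like in the loop above.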

