
Seaborn distplot() won't display frequency in the y-axis

I am trying to display the weighted frequency on the y-axis of a seaborn.distplot() graph, but it keeps displaying the density (which is the default in distplot()).

I read the documentation and also many similar questions here on Stack Overflow.

The common answer is to set norm_hist=False and to pass the weights as a NumPy array, as in a standard histogram. However, it keeps showing the density and not the probability/frequency of each bin.

My code is

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

plt.figure(figsize=(10, 4))
plt.xlim(-0.145,0.145)
plt.axvline(0, color='grey')
data = df['col1']

x = np.random.normal(data.mean(), scale=data.std(), size=(100000))
normal_dist =sns.distplot(x, hist=False,color="red",label="Gaussian")

data_viz = sns.distplot(data,color="blue", bins=31,label="data", norm_hist=False)

# I also tried adding the weights inside the argument
#hist_kws={'weights': np.ones(len(data))/len(data)})

plt.legend(bbox_to_anchor=(1, 1), loc=1)

And I keep receiving this output:

[image: the resulting plot]

Does anyone have an idea of what could be the problem here?

Thanks!

[EDIT]: The problem is that the y-axis is showing the kde values and not those from the weighted histogram. If I set kde=False then I can display the frequency in the y-axis. However, I still want to keep the kde , so I am not considering that option.

Keeping the KDE and the frequency/count on a single y-axis in one plot will not work because they have different scales. It is better to create a plot with two y-axes, one showing the KDE and the other showing the histogram. From the documentation for norm_hist: "If True, the histogram height shows a density rather than a count. **This is implied if a KDE or fitted density is plotted**."
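To see the difference in scale concretely, here is a minimal sketch using only NumPy. The sample is made up (it merely stands in for df['col1']), and the variable names are illustrative. It computes the same 31-bin histogram both as a per-bin frequency and as a density, and checks that they differ by the bin width:

```python
import numpy as np

# Made-up sample standing in for df['col1'] (an assumption, not the asker's data).
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=0.03, size=1000)

counts, edges = np.histogram(data, bins=31)
density, _ = np.histogram(data, bins=31, density=True)
widths = np.diff(edges)

# Frequency: fraction of samples in each bin; the bar heights sum to 1.
frequency = counts / counts.sum()

# Density: bar heights whose area (height * width) sums to 1,
# so density * width equals the per-bin frequency.
assert np.allclose(density * widths, frequency)

print("max frequency:", frequency.max())
print("max density:  ", density.max())
```

Because the bins here are much narrower than 1, the density values come out far larger than the frequencies, so a KDE curve drawn on the frequency axis would not line up with the bars.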

versusnja in https://github.com/mwaskom/seaborn/issues/479 has a workaround:

# Plot hist without kde.
# Create another Y axis.
# Plot kde without hist on the second Y axis.
# Remove Y ticks from the second axis.

first_ax  = sns.distplot(data, kde=False)
second_ax = first_ax.twinx()
sns.distplot(data, ax=second_ax, kde=True, hist=False)
second_ax.set_yticks([])

If you need this just for visualization it should be good enough.
