Python seaborn.distplot returning count instead of probability
I have a pandas series x:
0 -0.000069
1 -0.000059
2 -0.000025
3 -0.000021
4 -0.000021
...
1036 0.000032
1037 0.000033
1038 0.000052
1039 0.000055
1040 0.000092
Name: c, Length: 1041, dtype: float64
I would like to plot a probability density function together with a histogram, for which I used seaborn.distplot:
import matplotlib.pyplot as plt
import seaborn as sns
sns.distplot(x, hist=True, kde=True, bins=100,
             hist_kws={'edgecolor':'black', 'color': 'r'},
             kde_kws={'linewidth': 1, 'color': 'b'})
plt.xlim(-0.00002, 0.00002)
plt.ylim(ymin=0)
plt.xlabel("x")
plt.ylabel("probability")
plt.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
plt.show()
As a result, I get the following figure:
As shown, the vertical axis represents count, but I want (and expected from this code) probability. I am quite confused, as the identical code works properly for another pandas series. For example, with the same code applied to a different series (and different labels, etc.), I was able to produce the following correct figure:
Any idea why this code isn't working for my first series, and/or possible solutions?
The "problem", so to speak, is that you labeled your y-axis "probability" when it is not a probability. What the axis shows is a probability density; the probability is the area under the curve, and that total area is equal to 1.
In your first plot, the density is very large but the x-values are very small, so the product of the two (height times width) still gives a valid probability. See probability density function for more info.
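To make the height-times-width argument concrete, here is a minimal sketch using NumPy only. The data are a hypothetical stand-in for the series in the question (normally distributed values on the order of 1e-5, generated with an arbitrary seed), not the asker's actual data.

```python
import numpy as np

# Hypothetical stand-in for the series in the question:
# 1041 values on the order of 1e-5 (assumed, not the real data).
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2e-5, size=1041)

# density=True scales the bar heights so that the total bar area is 1.
density, edges = np.histogram(x, bins=100, density=True)
widths = np.diff(edges)

max_height = density.max()               # far above 1 because the bins are very narrow
total_area = np.sum(density * widths)    # height * width summed over all bins

print(f"tallest bar: {max_height:.1f}")
print(f"total area:  {total_area:.6f}")
```

The tallest bar is in the thousands, which can look like a count, yet the bars still integrate to 1.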
I would edit out your plt.ylabel("probability") and either relabel the axis with the correct quantity or not label it at all.
I recommend using plt.ylabel("probability density").
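If what you actually want on the vertical axis is the probability of each bin rather than the density, one possible approach is to weight every observation by 1/N so that the bar heights sum to 1. The sketch below is an illustration and not part of the original answer, and it again uses hypothetical stand-in data in place of the real series.

```python
import numpy as np

# Hypothetical stand-in for the series in the question (assumed data).
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2e-5, size=1041)

# Give each observation weight 1/N so each bar height is the fraction
# of observations in that bin, i.e. an estimated per-bin probability.
weights = np.full(x.shape, 1.0 / x.size)
probs, edges = np.histogram(x, bins=100, weights=weights)

print(f"largest bin probability:  {probs.max():.4f}")
print(f"sum of bin probabilities: {probs.sum():.6f}")
```

These heights could then be drawn with plt.bar(edges[:-1], probs, width=np.diff(edges), align="edge"), in which case the label "probability" would be accurate. A KDE curve is a density, so it would not share the same scale on that axis.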