Seaborn distribution plot line graph shows ringing
In my Seaborn graph with this code:
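import matplotlib.pyplot as plt
import seaborn as sns

# df is an existing DataFrame; dividing by 3600 assumes 'time' is in seconds
df['hour'] = df['time'] / 3600
plt.figure(figsize=(30, 10))
plt.gcf().subplots_adjust(left=0.3)
g = sns.distplot(df['hour'], axlabel='No. Of Hours', label='Frequency')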
the line graph that comes with the distribution plot looks really weird: it shows a spike for each bar and an increasing trend at the right tail of the distribution, where little data is present. Is this graph correct? If not, what is wrong with it and how can I correct it? Here is the graph:
This issue is reported at http://github.com/mwaskom/seaborn/issues/1590
It has to do with the KDE algorithm used by the statsmodels package. You can force seaborn to use scipy's algorithm instead by adding this line (note that _has_statsmodels is a private attribute, so this workaround depends on the seaborn version you have installed):
sns.distributions._has_statsmodels = False
Here is a short snippet that reproduces the issue:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'hour': [1] + 100 * [2] + [10]
})
plt.figure(figsize=(30,10))
plt.gcf().subplots_adjust(left = 0.3)
g = sns.distplot(df['hour'], axlabel = 'No. Of Hours', label = 'Frequency')
And here's the result if you force it to not use statsmodels:
sns.distributions._has_statsmodels = False
plt.figure(figsize=(30,10))
plt.gcf().subplots_adjust(left = 0.3)
g = sns.distplot(df['hour'], axlabel = 'No. Of Hours', label = 'Frequency')
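If you would rather not rely on a private seaborn flag, one option is to compute the density with scipy directly and draw it over a normalised histogram. The following is only a sketch: kde_curve and overlay_kde are hypothetical helper names (not seaborn or scipy functions), and it assumes scipy's default bandwidth (Scott's rule) is acceptable for your data.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_curve(values, num_points=500, pad=3.0):
    """Evaluate a Gaussian KDE of `values` on an evenly spaced grid.

    Hypothetical helper, not part of seaborn. Uses scipy's default
    bandwidth (Scott's rule); `pad` extends the grid past the data range.
    """
    values = np.asarray(values, dtype=float)
    kde = gaussian_kde(values)
    grid = np.linspace(values.min() - pad, values.max() + pad, num_points)
    return grid, kde(grid)


def overlay_kde(ax, values, bins=10):
    """Draw a density-normalised histogram with the KDE curve on top.

    `ax` is a matplotlib Axes, e.g. from `fig, ax = plt.subplots()`.
    """
    grid, density = kde_curve(values)
    ax.hist(values, bins=bins, density=True, alpha=0.4)
    ax.plot(grid, density)
    return ax


# Same sample data as the snippet above
hours = [1] + 100 * [2] + [10]
grid, density = kde_curve(hours)
```

To draw it, create an Axes with fig, ax = plt.subplots(figsize=(30, 10)) and call overlay_kde(ax, hours). The curve should show a single smooth peak near 2 rather than a spike per bar.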