Seaborn distribution plot line graph shows ringing

Question

I am drawing a Seaborn distribution plot with this code:

import matplotlib.pyplot as plt
import seaborn as sns

# convert 'time' to hours (assuming it is stored in seconds)
df['hour'] = df['time'] / 3600
plt.figure(figsize=(30, 10))
plt.gcf().subplots_adjust(left=0.3)
g = sns.distplot(df['hour'], axlabel='No. Of Hours', label='Frequency')

The line graph that comes with the distribution plot looks really weird: it shows a spike for each bar, and it shows an increasing trend at the right tail of the distribution, where little data is present. Is this graph correct? If not, what is wrong with it and how can I correct it? Here is the graph:

Answer 1

This issue is reported at http://github.com/mwaskom/seaborn/issues/1590

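It has to do with the KDE algorithm used by the statsmodels package, which seaborn uses when statsmodels is installed. You can force seaborn to use scipy's algorithm instead by adding this line before calling distplot (note that _has_statsmodels is a private attribute, so this is a workaround rather than a supported option):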

sns.distributions._has_statsmodels = False

Here is a short snippet that reproduces the issue:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
# one value at 1, a hundred values at 2, and a single outlier at 10
df = pd.DataFrame({
    'hour': [1] + 100 * [2] + [10]
})
plt.figure(figsize=(30, 10))
plt.gcf().subplots_adjust(left=0.3)
g = sns.distplot(df['hour'], axlabel='No. Of Hours', label='Frequency')

And here's the result if you force it to not use statsmodels:

sns.distributions._has_statsmodels = False
plt.figure(figsize=(30,10))               
plt.gcf().subplots_adjust(left = 0.3)
g = sns.distplot(df['hour'], axlabel = 'No. Of Hours', label = 'Frequency')
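
To judge whether a KDE curve is plausible for this data, it can help to compute one by hand. The following is a minimal, dependency-free sketch of a Gaussian KDE on the same reproduction data. The function name gaussian_kde_values, the evaluation grid, and the choice of Scott's rule for the bandwidth are assumptions made for illustration; this is not the code path seaborn, statsmodels, or scipy actually runs, and their default bandwidths may differ.

```python
import math
import statistics


def gaussian_kde_values(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate at each point of `grid`.

    Hypothetical helper for illustration only.
    """
    n = len(samples)
    if bandwidth is None:
        # Scott's rule of thumb (an assumption here): sample std * n^(-1/5)
        bandwidth = statistics.stdev(samples) * n ** (-1 / 5)
    # Normalisation so that the estimate integrates to 1
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Same data as the reproduction snippet above
hours = [1] + 100 * [2] + [10]

step = 0.05
grid = [i * step for i in range(-40, 281)]  # roughly -2.0 to 14.0
density = gaussian_kde_values(hours, grid)

peak_x = grid[density.index(max(density))]  # location of the highest point
area = sum(density) * step                  # Riemann-sum approximation of the integral
print(f"peak near x = {peak_x:.2f}, area under curve = {area:.3f}")
```

A direct sum of Gaussian kernels like this is non-negative everywhere and has one dominant bump where the data is concentrated, plus a much smaller bump at the outlier. A curve with a spike at every bar or a rising right tail is therefore a sign that something in the estimation went wrong, not a property of the data.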

1 answer

solution1
1 ACCEPTED 2020-03-10 03:01:54