
Get data points from Seaborn distplot

I use

sns.distplot

to plot a univariate distribution of observations. However, I need not only the chart but also the data points. How do I get the data points from the matplotlib Axes that distplot returns?

distplot returns a matplotlib Axes, so you can use the matplotlib Axes API. For instance, to get the first line:

sns.distplot(x).get_lines()[0].get_data()

This returns two numpy arrays containing the x and y values for the line.

For the bars, the information is stored in:

sns.distplot(x).patches

Each patch is a matplotlib Rectangle, so you can read a bar's height with its get_height() method:

[h.get_height() for h in sns.distplot(x).patches]
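A bar's height alone does not say where the bar sits on the x axis. The sketch below reads the left edge, width, and height of every bar from the Axes patches. It uses matplotlib's own hist on synthetic data rather than seaborn, on the assumption that distplot's bars are ordinary matplotlib Rectangle patches with the same accessors; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)  # synthetic stand-in for the real observations

fig, ax = plt.subplots()
ax.hist(x, bins=20, density=True)  # draw once and keep the Axes

# Each bar is a Rectangle: get_x() is its left edge, get_width() its bin width.
left_edges = np.array([p.get_x() for p in ax.patches])
widths = np.array([p.get_width() for p in ax.patches])
heights = np.array([p.get_height() for p in ax.patches])

# With density=True the bar areas sum to 1.
area = float(np.sum(widths * heights))
```

Keeping a reference to the Axes and reading everything from it avoids redrawing the plot for each query, which is what repeated sns.distplot(x) calls would do.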

If you want to obtain the KDE values behind a histogram, you can use scikit-learn's KernelDensity estimator instead:

import numpy as np
import pandas as pd
from sklearn.neighbors import KernelDensity

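# Load the observations; KernelDensity expects a 2D array of shape (n_samples, n_features)
ds = pd.read_csv('data-to-plot.csv')
X = ds.loc[:, 'Money-Spent'].values[:, np.newaxis]

# The bandwidth controls the amount of smoothing and can be tuned
kde = KernelDensity(kernel='gaussian', bandwidth=0.75).fit(X)

# Points at which to evaluate the density
x = np.linspace(0, 5, 100)[:, np.newaxis]

# score_samples returns log-densities, so exponentiate to get the density
log_density_values = kde.score_samples(x)
density = np.exp(log_density_values)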

array([1.88878660e-05, 2.04872903e-05, 2.21864649e-05, 2.39885206e-05,
       2.58965064e-05, 2.79134003e-05, 3.00421245e-05, 3.22855645e-05,
       3.46465903e-05, 3.71280791e-05, 3.97329392e-05, 4.24641320e-05,
       4.53246933e-05, 4.83177514e-05, 5.14465430e-05, 5.47144252e-05,
       5.81248850e-05, 6.16815472e-05, 6.53881807e-05, 6.92487062e-05,
       7.32672057e-05, 7.74479375e-05, 8.17953578e-05, 8.63141507e-05,
       ..........................
       ..........................
       3.93779919e-03, 4.15788216e-03, 4.38513011e-03, 4.61925890e-03,
       4.85992626e-03, 5.10672757e-03, 5.35919187e-03, 5.61677855e-03])
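To make clear what those density values represent, here is a minimal NumPy-only sketch of a Gaussian kernel density estimate: the average of normal bumps of width `bandwidth` centred on each sample. It is not scikit-learn's implementation; `gaussian_kde_values` is a hypothetical helper name and the data is synthetic. For a Gaussian kernel with the same bandwidth, the result should closely match np.exp(kde.score_samples(...)).

```python
import numpy as np

def gaussian_kde_values(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point of `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance between every grid point and every sample,
    # shape (len(grid), len(samples)).
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Average the kernels and rescale by the bandwidth so the result integrates to 1.
    return kernel.sum(axis=1) / (samples.size * bandwidth)

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.5, scale=1.0, size=300)  # synthetic observations
grid = np.linspace(-5.0, 10.0, 1001)
density = gaussian_kde_values(samples, grid, bandwidth=0.75)

# Riemann-sum check: a density should integrate to roughly 1 over a wide grid.
area = float(np.sum(density) * (grid[1] - grid[0]))
```

If the grid is too narrow (for example 0 to 5 when the data extends beyond that range), the curve is still correct at those points but the area under it will be less than 1.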

This will get the KDE curve you want:

import matplotlib.pyplot as plt
import seaborn as sns
line = sns.distplot(data).get_lines()[0]
plt.plot(line.get_xdata(), line.get_ydata())

With newer versions of seaborn this no longer works. First, distplot has been replaced by displot. Second, displot returns a FacetGrid rather than an Axes, so calling get_lines() on the result raises AttributeError: 'FacetGrid' object has no attribute 'get_lines'.
