
How to create a density plot in matplotlib?

In R I can create the desired output by doing:

data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
         rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))

[image: density plot in R]

In Python (with matplotlib) the closest I got was with a simple histogram:

import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()

[image: histogram in matplotlib]

I also tried the normed=True parameter but couldn't get anything other than trying to fit a Gaussian to the histogram.
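
For reference, in current matplotlib releases the normalisation that normed=True used to request is spelled density=True. A minimal sketch, assuming a reasonably recent matplotlib; the bin edges are an illustrative choice and not part of the question:

```python
import matplotlib.pyplot as plt
import numpy as np

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

# Assumed bin edges: unit-width bins centred on the six distinct values.
edges = np.arange(1, 8)

# density=True rescales the bar heights so that the total bar area is 1.
heights, edges, _ = plt.hist(data, bins=edges, density=True)
area = float(np.sum(heights * np.diff(edges)))
plt.show()
```

This is still a normalised histogram rather than a smooth curve, which is why a kernel density estimate is needed for the R-style plot.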

My latest attempts were around scipy.stats and gaussian_kde, following examples on the web, but I've been unsuccessful so far.

Five years later, when I Google "how to create a kernel density plot using python", this thread still shows up at the top!

Today, a much easier way to do this is to use seaborn, a package that provides many convenient plotting functions and good style management.

import numpy as np
import seaborn as sns
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.set_style('whitegrid')
sns.kdeplot(np.array(data), bw=0.5)


Sven has shown how to use the class gaussian_kde from SciPy, but you will notice that it doesn't look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can adjust the bandwidth by changing the function covariance_factor of the gaussian_kde class. First, here is what you get without changing that function:

[image]

However, if I use the following code:

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = gaussian_kde(data)
xs = np.linspace(0,8,200)
density.covariance_factor = lambda : .25
density._compute_covariance()
plt.plot(xs,density(xs))
plt.show()

I get

[image]

which is pretty close to what you are getting from R. What have I done? gaussian_kde uses a changeable function, covariance_factor, to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about 0.5. Lowering this lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be calculated correctly. It isn't an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.
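
To get closer to R's bw=0.5, one option is to pass the bandwidth through the public bw_method argument instead of overriding covariance_factor. As far as I understand it, R's bw is the standard deviation of the Gaussian kernel, whereas a scalar bw_method in gaussian_kde is a factor that multiplies the sample standard deviation (ddof=1) of the data. Under that assumption, dividing the desired bandwidth by the sample standard deviation gives a matching factor. A sketch, where r_bw is just an illustrative name:

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

r_bw = 0.5  # the value passed as bw= in the R call
# A scalar bw_method is used as the factor: kernel std = factor * sample std.
factor = r_bw / np.std(data, ddof=1)
density = gaussian_kde(data, bw_method=factor)

xs = np.linspace(0, 8, 200)
plt.plot(xs, density(xs))
plt.show()
```

This avoids calling the private _compute_covariance method, since the factor is fixed when the estimator is constructed.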

Option 1:

Use the pandas DataFrame plot (built on top of matplotlib):

import pandas as pd
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
pd.DataFrame(data).plot(kind='density') # or pd.Series()
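
The pandas density plot relies on scipy's gaussian_kde (so scipy needs to be installed) and accepts a bw_method argument that is forwarded to it. A sketch that sets the bandwidth explicitly; the factor 0.5 / s.std() is an assumption chosen to give a kernel standard deviation of 0.5, similar to R's bw=0.5:

```python
import matplotlib.pyplot as plt
import pandas as pd

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

s = pd.Series(data)
# Series.std() uses ddof=1, the same convention gaussian_kde uses
# when it scales the bandwidth factor by the spread of the data.
ax = s.plot(kind='density', bw_method=0.5 / s.std())
plt.show()
```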


Option 2:

Use distplot of seaborn:

import seaborn as sns
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.distplot(data, hist=False)


Maybe try something like:

import matplotlib.pyplot as plt
import numpy
from scipy import stats
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = stats.gaussian_kde(data)
x = numpy.arange(0., 8, .1)
plt.plot(x, density(x))
plt.show()

You can easily replace gaussian_kde() with a different kernel density estimate.
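
As one illustration of swapping in a different estimator, here is a small hand-written KDE in plain NumPy with an Epanechnikov kernel. This is a sketch, not a library API: the function name epanechnikov_kde and the bandwidth h=1.0 are choices made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np


def epanechnikov_kde(x, samples, h):
    """Evaluate a kernel density estimate at x with bandwidth h."""
    x = np.asarray(x, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # Scaled distance between every evaluation point and every sample.
    u = (x[:, None] - samples[None, :]) / h
    # Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere.
    k = np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)
    # Average the kernels and rescale by the bandwidth.
    return k.sum(axis=1) / (len(samples) * h)


data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
x = np.arange(0., 8, .1)
y = epanechnikov_kde(x, data, h=1.0)
plt.plot(x, y)
plt.show()
```

The result is used the same way as density(x) in the snippet above: evaluate on a grid, then pass the values to plt.plot.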

The density plot can also be created by using matplotlib: the function plt.hist(data) returns the y and x values necessary for the density plot (see the documentation https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist.html ). As a result, the following code creates a density plot by using the matplotlib library:

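import matplotlib.pyplot as plt
dat = [-1, 2, 1, 4, -5, 3, 6, 1, 2, 1, 2, 5, 6, 5, 6, 2, 2, 2]
a = plt.hist(dat, density=True)  # a[0]: bin heights, a[1]: bin edges
plt.close()  # discard the histogram figure
plt.figure()
plt.plot(a[1][1:], a[0])  # heights against the right-hand bin edges
plt.show()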

This code returns the following density plot:

[image]
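
One detail of the snippet above: a[1][1:] are the right-hand bin edges, so the curve is shifted half a bin to the right. A variant, sketched under the assumption that a polygon through the bin centres is what is wanted, uses np.histogram so that no throwaway figure has to be created and closed:

```python
import matplotlib.pyplot as plt
import numpy as np

dat = [-1, 2, 1, 4, -5, 3, 6, 1, 2, 1, 2, 5, 6, 5, 6, 2, 2, 2]

# density=True normalises the heights so that the histogram area is 1.
heights, edges = np.histogram(dat, density=True)
centres = (edges[:-1] + edges[1:]) / 2  # midpoint of each bin
plt.plot(centres, heights)
plt.show()
```

Note that this is a frequency polygon rather than a kernel density estimate, so it stays as coarse as the bins.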

You can do something like:

import numpy as np
import matplotlib.pyplot as plt
s = np.random.normal(2, 3, 1000)
count, bins, ignored = plt.hist(s, 30, density=True)
plt.plot(bins, 1/(3 * np.sqrt(2 * np.pi)) * np.exp( - (bins - 2)**2 / (2 * 3**2) ),
         linewidth=2, color='r')
plt.show()
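
The expression passed to plt.plot above is the normal density with mean 2 and standard deviation 3 written out by hand. A sketch of the same overlay using scipy.stats.norm.pdf, assuming scipy is available; the seed is only there to make the example repeatable:

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

mu, sigma = 2, 3
rng = np.random.default_rng(0)  # seeded generator (an illustrative choice)
s = rng.normal(mu, sigma, 1000)

count, bins, ignored = plt.hist(s, 30, density=True)
# norm.pdf evaluates the same Gaussian formula at each bin edge.
pdf = norm.pdf(bins, loc=mu, scale=sigma)
plt.plot(bins, pdf, linewidth=2, color='r')
plt.show()
```

This draws the known generating distribution, not an estimate computed from the data.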
