How to plot several kernel density estimates using matplotlib?

I want to plot several "filled" kernel density estimates (KDEs) in matplotlib, like the upper halves of vertical violin plots or a non-overlapping version of the cover art of Joy Division's Unknown Pleasures.

Ideally, I want matplotlib to create the density estimates itself, so that I don't have to call scipy's gaussian_kde myself.

This answer shows how to modify Matplotlib's violin plots. Those violin plots can also be adapted to show only the upper half of each violin.

import matplotlib.pyplot as plt
import numpy as np

pos = np.arange(1, 6) / 2.0
data = [np.random.normal(0, std, size=1000) for std in pos]

violins = plt.violinplot(data, positions=pos, showextrema=False, vert=False)

# Flatten the lower half of each violin onto its center line.
for body in violins['bodies']:
    path = body.get_paths()[0]
    mean = np.mean(path.vertices[:, 1])
    path.vertices[:, 1][path.vertices[:, 1] <= mean] = mean

plt.show()

[Figure: KDE plot]
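The clipping step above can be wrapped in a small reusable helper. The following is only a sketch: the function name half_violins and the fixed random seed are made up for illustration, and it clips at each violin's known position instead of the mean of the vertices, which should give nearly the same baseline.

```python
import matplotlib.pyplot as plt
import numpy as np


def half_violins(ax, data, positions, **kwargs):
    """Draw horizontal violins and keep only the half above each baseline.

    Hypothetical helper, not part of Matplotlib.
    """
    violins = ax.violinplot(data, positions=positions, showextrema=False,
                            vert=False, **kwargs)
    for body, baseline in zip(violins["bodies"], positions):
        vertices = body.get_paths()[0].vertices
        # Raise every vertex that lies below the baseline up to the baseline.
        vertices[:, 1] = np.maximum(vertices[:, 1], baseline)
    return violins["bodies"]


rng = np.random.default_rng(0)  # seed chosen only for reproducibility
pos = np.arange(1, 6) / 2.0
data = [rng.normal(0, std, size=1000) for std in pos]

fig, ax = plt.subplots()
bodies = half_violins(ax, data, pos)
```

Calling plt.show() afterwards displays the figure. Extra keyword arguments such as widths are passed through to violinplot.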

A nice-looking overlapping variant can easily be created by making the bodies fully opaque (alpha of 1), adding an edge color, and making sure to plot the KDEs that lie underneath first:

import matplotlib.pyplot as plt
import numpy as np

pos = np.arange(1, 6) / 2
data = [np.random.normal(0, std, size=1000) for std in pos]

# Reverse the order so that the KDEs lying underneath are drawn first.
violins = plt.violinplot(
    data[::-1],
    positions=pos[::-1] / 5,
    showextrema=False,
    vert=False,
)

for body in violins['bodies']:
    path = body.get_paths()[0]
    mean = np.mean(path.vertices[:, 1])
    path.vertices[:, 1][path.vertices[:, 1] <= mean] = mean
    body.set_edgecolor('black')
    body.set_alpha(1)  # fully opaque, so each body hides the ones behind it

plt.show()

[Figure: Joy Division plot]

Note that there is an existing package called joypy, built on top of matplotlib, that easily produces such "Joyplots" from dataframes.

Apart from that, there is little reason not to use scipy.stats.gaussian_kde, because it provides the KDE directly. Matplotlib's violinplot relies on an equivalent Gaussian KDE internally (matplotlib.mlab.GaussianKDE).
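To illustrate what "directly providing the KDE" means, here is a minimal sketch that evaluates gaussian_kde on a grid. The sample, the seed, and the grid padding of 3 are arbitrary choices made for this example only.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
sample = rng.normal(0, 1.0, size=1000)

kde = gaussian_kde(sample)  # bandwidth defaults to Scott's rule
# Pad the grid beyond the data range so the tails of the estimate are covered.
grid = np.linspace(sample.min() - 3, sample.max() + 3, 500)
density = kde(grid)  # estimated density at every grid point

# Riemann-sum approximation of the integral; a density should integrate to ~1.
area = np.sum(density) * (grid[1] - grid[0])
print(f"approximate area under the KDE: {area:.3f}")
```

The returned object is callable, so the curve can be passed straight to ax.plot or ax.fill_between.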

So the plot in question would look something like this:

from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt
import numpy as np

pos = np.arange(1, 6) / 2.0
data = [np.random.normal(0, std, size=1000) for std in pos]

def plot_kde(data, y0, height, ax=None, color="C0"):
    if ax is None:
        ax = plt.gca()
    x = np.linspace(data.min(), data.max())
    y = gaussian_kde(data)(x)
    # Scale the curve so that its peak sits `height` above the baseline y0.
    scaled = y0 + y / y.max() * height
    ax.plot(x, scaled, color=color)
    ax.fill_between(x, scaled, y0, color=color, alpha=0.5)

# One KDE per dataset, stacked at integer baselines.
for i, d in enumerate(data):
    plot_kde(d, i, 0.8, ax=None)

plt.show()

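The same approach can be adapted to the overlapping, Unknown Pleasures-like look. This is a rough sketch under a few assumptions: the name ridgeline and its spacing and height parameters are invented for illustration, all curves share one x grid, and the stacking order is controlled explicitly through zorder.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde


def ridgeline(ax, samples, spacing=0.3, height=1.0):
    """Draw one opaque, outlined KDE per sample on a shared x grid.

    Hypothetical helper; `spacing` is the distance between baselines and
    `height` the peak height of every curve.
    """
    lo = min(s.min() for s in samples)
    hi = max(s.max() for s in samples)
    x = np.linspace(lo, hi, 200)
    polys = []
    for i, sample in enumerate(samples):
        y = gaussian_kde(sample)(x)
        y0 = i * spacing
        top = y0 + y / y.max() * height
        # Lower rows get a higher zorder, so they cover the rows behind them.
        poly = ax.fill_between(x, y0, top, facecolor="white",
                               edgecolor="black", zorder=len(samples) - i)
        polys.append(poly)
    return polys


rng = np.random.default_rng(0)  # seed chosen only for reproducibility
stds = np.arange(1, 6) / 2.0
samples = [rng.normal(0, std, size=1000) for std in stds]

fig, ax = plt.subplots()
polys = ridgeline(ax, samples)
```

Calling plt.show() afterwards displays the figure. Because height is larger than spacing here, neighbouring curves overlap; choosing height below spacing gives the non-overlapping version.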
