
python: distplot with multiple distributions

I am using seaborn to plot a distribution plot. I would like to plot multiple distributions on the same plot, each in a different color.

Here's how I start the distribution plot:

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

sns.distplot(iris[['sepal length (cm)']], hist=False, rug=True)

The 'target' column contains 3 values: 0, 1, 2.

I would like to see one distribution of sepal length for each of target == 0, target == 1, and target == 2, for a total of 3 curves on the same plot.

Does anyone know how I can do that?

Thank you.

The important thing is to split the dataframe into subsets where target is 0, 1, or 2, and then plot each subset on the same axes.

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns
import matplotlib.pyplot as plt

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

# Split the dataframe by target value
target_0 = iris.loc[iris['target'] == 0]
target_1 = iris.loc[iris['target'] == 1]
target_2 = iris.loc[iris['target'] == 2]

sns.distplot(target_0[['sepal length (cm)']], hist=False, rug=True)
sns.distplot(target_1[['sepal length (cm)']], hist=False, rug=True)
sns.distplot(target_2[['sepal length (cm)']], hist=False, rug=True)

plt.show()

The output shows the three density curves overlaid on the same axes, each in a different color.

If you don't know how many values target may have, find the unique values in the target column, then slice the dataframe for each value and add each slice to the plot.

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns
import matplotlib.pyplot as plt

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

unique_vals = iris['target'].unique()  # [0, 1, 2]

# Split the dataframe by target value
# Use a list comprehension to create a list of sliced dataframes
targets = [iris.loc[iris['target'] == val] for val in unique_vals]

# Iterate through the list and plot each sliced dataframe on the same axes
for target in targets:
    sns.distplot(target[['sepal length (cm)']], hist=False, rug=True)

# sns.plt is not available in current seaborn; call pyplot directly
plt.show()
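
As an alternative to building the list of slices by hand, pandas' groupby yields each target value together with its rows. The following is only a minimal sketch: it uses a small made-up dataframe (the column name `sepal_length` and all values are hypothetical stand-ins for the iris data), and the plotting call is left as a comment so the snippet runs without seaborn.

```python
import pandas as pd

# Hypothetical stand-in for the iris dataframe (values are made up)
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "target": [0, 0, 1, 1, 2, 2],
})

# groupby yields (target value, sub-dataframe) pairs, one per unique target
groups = {val: group for val, group in df.groupby("target")}

for val, group in groups.items():
    # In the answer above, this is where each slice would be plotted, e.g.
    # sns.distplot(group["sepal_length"], hist=False, rug=True, label=val)
    print(val, len(group))
```

Passing a label per group, as in the comment, also makes it possible to add a legend so the curves can be told apart.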

A more common approach for this type of problem is to recast your data into long format using melt, and then let map do the rest.

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']], 
                    columns=iris['feature_names'] + ['target'])

# recast into long format 
df = iris.melt(['target'], var_name='cols',  value_name='vals')

df.head()

   target               cols  vals
0     0.0  sepal length (cm)   5.1
1     0.0  sepal length (cm)   4.9
2     0.0  sepal length (cm)   4.7
3     0.0  sepal length (cm)   4.6
4     0.0  sepal length (cm)   5.0
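
To make the reshaping step concrete, here is a minimal sketch of what melt does on a tiny wide-format frame. The frame and its column names (`sepal_length`, `petal_length`) are hypothetical and the values are made up; only the melt call mirrors the one used above.

```python
import pandas as pd

# Hypothetical two-feature frame in wide format (values are made up)
wide = pd.DataFrame({
    "sepal_length": [5.1, 7.0],
    "petal_length": [1.4, 4.7],
    "target": [0, 1],
})

# Keep 'target' as the identifier; every other column is stacked into
# (cols, vals) pairs, one row per original cell
long = wide.melt(["target"], var_name="cols", value_name="vals")

print(long.shape)  # 2 rows x 2 feature columns -> (4, 3)
```

Each feature column becomes a block of rows labelled by its name in `cols`, which is what lets FacetGrid draw one panel per feature.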

You can now plot simply by creating a FacetGrid and using map:

g = sns.FacetGrid(df, col='cols', hue="target", palette="Set1")
g = (g.map(sns.distplot, "vals", hist=False, rug=True))


I found a simpler solution using FacetGrid, posted by citynorman at https://github.com/mwaskom/seaborn/issues/861:

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

g = sns.FacetGrid(iris, hue="target")
g = g.map(sns.distplot, "sepal length (cm)",  hist=False, rug=True)


For anyone trying to build the same plot with seaborn 0.11.0 or later: distplot is deprecated as of that release and is being replaced by displot.

With the new version, the code would be:

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

sns.displot(data=iris, x='sepal length (cm)', hue='target', kind='kde', fill=True, palette=sns.color_palette('bright')[:3], height=5, aspect=1.5)


A more recent and simpler option:

sns.displot(data=iris, x='sepal length (cm)', hue='target', kind='kde')


If anyone is looking to get a FacetGrid of distribution plots or histograms in seaborn, the new sns.displot function has FacetGrid built into it. This makes it quite easy to use if you melt the iris dataframe first.

Building on and updating the code in the previous answers by @Abbas and @Amit Amola:

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns

iris = load_iris()
iris = pd.DataFrame(data= np.c_[iris['data'], iris['target']],columns= iris['feature_names'] + ['target'])
iris['target'] = iris['target'].astype(str)

iris_melt = iris.melt(id_vars='target')

iris_melt.head()

  target           variable  value
0    0.0  sepal length (cm)    5.1
1    0.0  sepal length (cm)    4.9
2    0.0  sepal length (cm)    4.7
3    0.0  sepal length (cm)    4.6
4    0.0  sepal length (cm)    5.0

sns.displot(
    data=iris_melt, 
    x='value', 
    hue='target', 
    kind='kde', 
    fill=True,
    col='variable'
)


