
Get data array from a Seaborn pairplot

I have used the seaborn pairplot function and would like to extract a data array.

import seaborn as sns

iris = sns.load_dataset("iris")
sns.pairplot(iris, hue="species")

I want to get an array of the points shown in black below:

[image: enter image description here]

Thanks.

Just this line:

data = iris[iris['species'] == 'setosa']['sepal_length']

You are interested in the blue line, which corresponds to the 'setosa' species. To filter the iris dataframe, I create this filter:

iris['species'] == 'setosa'

which is a boolean array (a pandas Series) whose values are True where the corresponding row of the 'species' column in the iris dataframe is 'setosa', and False otherwise. With this line of code:

iris[iris['species'] == 'setosa']

I apply the filter to the dataframe to extract only the rows belonging to the 'setosa' species. Finally, I extract the 'sepal_length' column:

iris[iris['species'] == 'setosa']['sepal_length']
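
The three steps above can be sketched end to end as follows. This is only an illustration: the small dataframe is a hypothetical stand-in with made-up rows (so it runs without downloading the real dataset), and using .loc plus .to_numpy() is my own choice for getting a plain NumPy array, not something the original line requires.

```python
import pandas as pd

# Hypothetical stand-in for the iris dataframe (made-up rows, same column
# names), so the sketch runs without downloading the real dataset.
iris = pd.DataFrame({
    "species": ["setosa", "setosa", "versicolor", "setosa", "virginica"],
    "sepal_length": [5.1, 4.9, 7.0, 4.7, 6.3],
})

# Step 1: boolean mask, True where the row belongs to 'setosa'.
mask = iris["species"] == "setosa"

# Steps 2 and 3: keep the matching rows and select one column in a single
# .loc call (equivalent to iris[mask]['sepal_length']).
setosa_sepal_length = iris.loc[mask, "sepal_length"]

# Convert the pandas Series to a plain NumPy array.
data_array = setosa_sepal_length.to_numpy()
print(data_array)
```

With the real dataset, replacing the stand-in by iris = sns.load_dataset("iris") should give the same kind of result.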

If I plot a KDE for this data array with this code:

data = iris[iris['species'] == 'setosa']['sepal_length']
sns.kdeplot(data)

I get:

[image: enter image description here]

which is the curve from the plot above that you are interested in.

The y-values differ from those in the plot above because of the way the KDE is calculated.
I quote this reference:

The y-axis in a density plot is the probability density function for the kernel density estimation. However, we need to be careful to specify this is a probability density and not a probability. The difference is the probability density is the probability per unit on the x-axis. To convert to an actual probability, we need to find the area under the curve for a specific interval on the x-axis. Somewhat confusingly, because this is a probability density and not a probability, the y-axis can take values greater than one. The only requirement of the density plot is that the total area under the curve integrates to one. I generally tend to think of the y-axis on a density plot as a value only for relative comparisons between different categories.
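
To make the quoted point concrete, here is a minimal sketch that writes out a Gaussian KDE by hand with NumPy. The sample is made up, and the bandwidth follows Scott's rule of thumb, which is my assumption for the illustration and not necessarily what seaborn uses in your version. It checks that the curve can rise above one while the area under it stays close to one.

```python
import numpy as np

# Made-up, tightly clustered sample (not the real iris values).
sample = np.array([5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.1, 5.0])

# Bandwidth from Scott's rule of thumb (an assumption for this sketch).
n = sample.size
bandwidth = sample.std(ddof=1) * n ** (-1 / 5)

# Evaluation grid extending well past the data so the tails are included.
grid = np.linspace(sample.min() - 5 * bandwidth,
                   sample.max() + 5 * bandwidth, 2001)

# Gaussian KDE: average of normal bumps centred on each observation.
z = (grid[:, None] - sample[None, :]) / bandwidth
density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

# Area under the curve via the trapezoidal rule.
dx = grid[1] - grid[0]
area = float(np.sum((density[:-1] + density[1:]) / 2) * dx)

print("peak density:", density.max())  # larger than 1 for this narrow sample
print("area under curve:", area)       # close to 1
```

The peak exceeds one here only because the sample spans a range much narrower than one unit on the x-axis; the area, which is the actual probability, still stays at about one.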
