
Displaying pair plot in Pandas data frame

I am trying to display a pair plot created with scatter_matrix from a pandas DataFrame. This is how the pair plot is created:

# Create dataframe from data in X_train
# Label the columns using the strings in iris_dataset.feature_names
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
# Create a scatter matrix from the dataframe, color by y_train
grr = pd.scatter_matrix(iris_dataframe, c=y_train, figsize=(15, 15), marker='o',
hist_kwds={'bins': 20}, s=60, alpha=.8, cmap=mglearn.cm3)

I want the pair plot to look something like this:

[image: the desired pair plot]

I am using Python 3.6 and PyCharm, not Jupyter Notebook.

This code worked for me using Python 3.5.2:

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn import datasets

iris_dataset = datasets.load_iris()
X = iris_dataset.data
Y = iris_dataset.target

iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)

# Create a scatter matrix from the dataframe, color by y_train
grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

For pandas versions earlier than 0.20.0, use the old top-level function instead (thanks to michael-szczepaniak for pointing out that this API has been deprecated):

grr = pd.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                        hist_kwds={'bins': 20}, s=60, alpha=.8)

I just had to remove the cmap=mglearn.cm3 argument, because I was not able to make mglearn work; there is a version mismatch issue with sklearn.
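If mglearn cannot be installed, a discrete colormap built with matplotlib can stand in for mglearn.cm3. This is only a sketch: the name three_class_cmap and the three colors are assumptions chosen for illustration, and they are not claimed to match mglearn's palette.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import ListedColormap
from sklearn.datasets import load_iris

iris_dataset = load_iris()
iris_dataframe = pd.DataFrame(iris_dataset.data, columns=iris_dataset.feature_names)

# Hypothetical stand-in for mglearn.cm3: one arbitrary color per iris class.
three_class_cmap = ListedColormap(["#1f77b4", "#d62728", "#2ca02c"])

# Same call as in the answer, with the stand-in colormap passed as cmap.
axes = pd.plotting.scatter_matrix(iris_dataframe, c=iris_dataset.target,
                                  figsize=(15, 15), marker="o",
                                  hist_kwds={"bins": 20}, s=60, alpha=0.8,
                                  cmap=three_class_cmap)
```

Because the class labels are the integers 0, 1 and 2, each class is drawn in exactly one of the three listed colors.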

To save the image directly to a file without displaying it, you can use:

plt.savefig('foo.png')

Also remove this line:

# %matplotlib inline
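Put together, a plain script (no Jupyter) that writes the plot to disk could look roughly like this. It is a sketch under assumptions: the data is random rather than the iris set, the file name pair_plot.png is arbitrary, and the Agg backend is selected explicitly so that no window is needed.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file, open no window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative random data: 60 rows, 3 numeric columns, 3 integer class labels.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(60, 3)), columns=["a", "b", "c"])
labels = rng.integers(0, 3, size=60)

axes = pd.plotting.scatter_matrix(df, c=labels, figsize=(8, 8), marker="o",
                                  hist_kwds={"bins": 20}, s=60, alpha=0.8)

out = Path("pair_plot.png")  # arbitrary output file name
plt.savefig(out)             # saves the current figure, i.e. the scatter matrix
plt.close("all")
```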


Just an update to Vikash's excellent answer. The last two lines should now be:

grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

The scatter_matrix function has been moved to the pandas.plotting module, so the original answer, while correct at the time, now relies on a deprecated API.

So the complete code would now be:

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn import datasets

iris_dataset = datasets.load_iris()
X = iris_dataset.data
Y = iris_dataset.target

iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)
# create a scatter matrix from the dataframe, color by y_train
grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)
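If one script has to run on both old and new pandas installations, the import can be guarded. This is a sketch, assuming that pandas releases older than 0.20 expose the function under pandas.tools.plotting; only the first branch is exercised on a current pandas.

```python
try:
    # pandas 0.20 and later
    from pandas.plotting import scatter_matrix
except ImportError:
    # assumed location in pandas older than 0.20
    from pandas.tools.plotting import scatter_matrix
```

After this, scatter_matrix(iris_dataframe, c=Y, ...) can be called the same way regardless of which branch was taken.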

This is also possible using seaborn:

import seaborn as sns

df = sns.load_dataset("iris")
sns.pairplot(df, hue="species")

[image: Seaborn pair plot of the iris data]

I finally figured out how to do it in PyCharm.

Just import matplotlib.pyplot as plt and call plt.show() after creating the plot:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import mglearn
from pandas.plotting import scatter_matrix

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()

X_train,X_test,Y_train,Y_test = train_test_split(iris_dataset['data'],iris_dataset['target'],random_state=0)
iris_dataframe = pd.DataFrame(X_train,columns=iris_dataset.feature_names)

grr = scatter_matrix(iris_dataframe,c = Y_train,figsize = (15,15),marker = 'o',
                        hist_kwds={'bins':20},s=60,alpha=.8,cmap = mglearn.cm3)
plt.show()

Then it works perfectly, as shown below:

[image: the resulting plot]

First install mglearn with pip install mglearn, then import it.

The code will look like this:

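from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import mglearn
import matplotlib.pyplot as plt

# Load and split the data; the original snippet used these names without defining them
iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
# pd.scatter_matrix is deprecated; pd.plotting.scatter_matrix is the pandas >= 0.20 location
grr = pd.plotting.scatter_matrix(iris_dataframe, c=y_train, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8, cmap=mglearn.cm3)
plt.show()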
