
Seaborn and pd.scatter_matrix() plot color issues

I am making a pd.scatter_matrix() plot from a DataFrame based on the Iris dataset, colored by the target variable (plant species). When I run the code below, I get a scatter matrix with black, grey and white (!) points, which hinders visualization. The grid seems inconsistent too: apparently only the plots next to the axes get grid lines. I wanted a nice grid and a scatter matrix that follows the sns default color palette (blue, green, red).

Why do the seaborn plot style and pd.scatter_matrix() enforce a different (awful!) color palette than the defaults for the scatter plots, along with inconsistent grid lines? How can I solve these visualization issues?

I already updated seaborn to a fairly recent version (0.8, July 2017). I also tried the non-deprecated version of the pandas scatter matrix, pd.plotting.scatter_matrix(), and had no luck. If I use the 'ggplot' style, the color palette is correct for the scatter plots, but the grids are still inconsistent.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn')
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target
df = pd.DataFrame(X, columns = iris.feature_names)

pd.scatter_matrix(df, c=y, figsize = [8,8],
                      s=80, marker = 'D');

[image]

Package versions:

pandas version: 0.20.1
matplotlib version: 2.0.2
seaborn version: 0.8.0

I am not sure if this answers your question, but you could use seaborn's pairplot. Let me know.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# %matplotlib inline  (IPython/Jupyter magic; uncomment in a notebook)

from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target
df = pd.DataFrame(X, columns=iris.feature_names)

# pandas scatter matrix, colored by the numeric target
pd.plotting.scatter_matrix(df, c=y, figsize=[8, 8],
                           s=80, marker='D');

# seaborn alternative: add the target as a column and use it as the hue
df['y'] = y
sns.pairplot(df, hue='y')

which gives you:

[image]

If you want to avoid the last row of the visualization, then:

import seaborn as sns
# %matplotlib inline  (IPython/Jupyter magic; uncomment in a notebook)

sns.set(style="ticks", color_codes=True)

# seaborn's copy of the dataset already has a categorical "species" column
iris = sns.load_dataset("iris")
sns.pairplot(iris, hue="species")

[image]
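If you would rather keep the pandas scatter matrix than switch to pairplot, one option is to pass explicit colors instead of the numeric target. This is only a sketch under an assumption: matplotlib maps a numeric c through the active default colormap, and I am assuming that the grey shades in the question come from the colormap chosen by the style. The random DataFrame and the three-color palette below are stand-ins for the Iris data and the seaborn palette, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (hypothetical): 30 rows, 3 features, 3 classes of 10 rows each.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(30, 3)), columns=["a", "b", "c"])
y = np.repeat([0, 1, 2], 10)

# One explicit color per class, so no colormap is involved.
palette = ["tab:blue", "tab:green", "tab:red"]
colors = [palette[label] for label in y]

# Extra keyword arguments such as c and s are forwarded to the scatter calls.
axes = pd.plotting.scatter_matrix(df, c=colors, figsize=(8, 8), s=80, marker="D")
plt.close("all")
```

With the real data, df and y would come from datasets.load_iris() as in the code above, and palette could be taken from sns.color_palette().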

The default matplotlib settings are not very aesthetic; however, do not underestimate the power of matplotlib.

The simplest solution to your problem might be:

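import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets

plt.style.use('ggplot')  # this is the trick

iris = datasets.load_iris()
X = iris.data
y = iris.target
df = pd.DataFrame(X, columns=iris.feature_names)

# pd.plotting.scatter_matrix is the non-deprecated name for pd.scatter_matrix
pd.plotting.scatter_matrix(df, c=y, figsize=[10, 10], s=50);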

[image]

(The full list of available styles can be accessed via plt.style.available.)
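As a rough sketch of how that list can be used: the snippet below checks which style names are installed before applying one, and applies it only temporarily with plt.style.context. The helper pick_style and its preference order are my own illustration, not part of matplotlib. As far as I know, newer matplotlib releases renamed the seaborn style sheets with a seaborn-v0_8 prefix, which is why both spellings are tried.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def pick_style(preferred=("seaborn-v0_8", "seaborn", "ggplot")):
    """Return the first preferred style that this matplotlib install provides.

    Hypothetical helper: falls back to matplotlib's built-in "default" style.
    """
    for name in preferred:
        if name in plt.style.available:
            return name
    return "default"


style = pick_style()
print(sorted(plt.style.available))  # every style name known to this install

# Apply the style only inside the block; the global settings are restored afterwards.
with plt.style.context(style):
    fig, ax = plt.subplots()
    ax.scatter([1, 2, 3], [3, 1, 2])
    plt.close(fig)
```

Using the context manager instead of plt.style.use keeps the change local, which makes it easier to compare styles side by side.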

You can further customize the plot to your needs by adjusting the matplotlibrc file. An example of what can be done with it can be found here
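The same settings that go into a matplotlibrc file can also be overridden from code, which may be easier to experiment with. A minimal sketch, assuming you want grid lines on every axes and a non-grey default colormap; the particular keys and values below are only examples, not recommended settings.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Example overrides; the keys are the same names used in a matplotlibrc file.
overrides = {
    "axes.grid": True,          # draw grid lines on every axes
    "grid.linestyle": "--",
    "image.cmap": "plasma",     # default colormap used when c is numeric
    "figure.figsize": (8, 8),
}

# rc_context applies the overrides only inside the block.
with plt.rc_context(overrides):
    fig, ax = plt.subplots()
    points = ax.scatter([0, 1, 2], [2, 0, 1], c=[0, 1, 2], s=80, marker="D")
    cmap_name = points.get_cmap().name  # colormap picked up from the overrides
    plt.close(fig)

print(cmap_name)
```

Once the values look right, they can be moved into a matplotlibrc file, or set globally through plt.rcParams.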
