
How to draw a figure by seaborn pairplot in several rows?

I have a dataset with 76 features and one dependent variable (y). I use seaborn's pairplot in a Jupyter notebook to plot each feature against y. Because the number of features is high, the plot for each feature is very small, as can be seen below:

[image]

I am looking for a way to draw the pairplot across several rows. I also don't want to copy and paste the pairplot code into several notebook cells; I would like the figure to be generated automatically.

The code I am using (I cannot share the dataset, so I use a sample dataset):

from sklearn.datasets import load_boston
import math
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

X, y = load_boston(return_X_y=True)
X = pd.DataFrame(X)
y = pd.DataFrame(y)
data = pd.concat([X, y], axis=1)

y_name = 'y'
features_names = [f'feature_{i}' for i in range(1, X.shape[1]+1)]  
column_names = features_names + [y_name]
data.columns = column_names

plot_size=7
num_plots_x=5   # No. of plots in every row
num_plots_y = math.ceil(len(features_names)/num_plots_x)   # No. of plots in y direction

fig = plt.figure(figsize=(plot_size*num_plots_y, plot_size*num_plots_x), facecolor='white')
axes = [fig.add_subplot(num_plots_y,1,i+1) for i in range(num_plots_y)]   

for i, ax in enumerate(axes):   
    start_index = i * num_plots_x
    end_index = (i+1) * num_plots_x
    if end_index > len(features_names): end_index = len(features_names)
    sns.pairplot(x_vars=features_names[start_index:end_index], y_vars=y_name, data = data)

plt.savefig('figure.png')

The above code has two problems. First, it shows an empty box at the top of the figure and only then the pairplots. The following is part of the figure that I get:

[image]

The second problem is that it saves only the last row to the PNG file, not the whole figure.

If you have any idea how to solve this, please let me know. Thank you.

When I run it directly (python script.py), it opens every row in a separate window, so it treats each row as a separate object and saves only the last one to the file.

The other problem is that sns.pairplot doesn't use fig and axes: it is a figure-level function that creates its own figure, so it can't draw into existing subplots to put everything in one image. When I remove fig and axes, it stops showing the first window with the empty box.
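A minimal sketch of that behavior, using synthetic data in place of the real dataset (the demo frame and the column names f1..f4 are assumptions for illustration only). Each pairplot call returns its own PairGrid with its own figure, which is why plt.savefig, which acts on the current figure only, writes just the last row:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend for the sketch; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (an assumption, not the asker's dataset)
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(50, 5)),
                    columns=["f1", "f2", "f3", "f4", "y"])

plt.close("all")
before = len(plt.get_fignums())

grids = []
for chunk in (["f1", "f2"], ["f3", "f4"]):
    # pairplot takes no `ax` argument; it builds and returns a new PairGrid
    grids.append(sns.pairplot(data=demo, x_vars=chunk, y_vars="y"))

after = len(plt.get_fignums())
print("figures created:", after - before)

# Every grid owns a separate figure, so each row has to be saved on its own
out_dir = tempfile.mkdtemp()
saved = []
for i, g in enumerate(grids):
    path = os.path.join(out_dir, f"row_{i}.png")
    g.savefig(path)
    saved.append(path)
```

Saving each grid separately gives one file per row; getting everything into a single image needs a different layout approach.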


I found that FacetGrid has col_wrap to wrap the plots onto many rows. I also found that someone suggested adding col_wrap to pairplot (Add parameter col_wrap to pairplot #2121), and there is also an example of how to use FacetGrid with scatterplot instead of pairplot, which can then use col_wrap.
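As an illustration of col_wrap on its own, here is a rough sketch that assumes the data can be reshaped to long form with pandas melt. sns.relplot builds a FacetGrid internally and accepts col_wrap directly. The demo frame and the feature names are made up for the example:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend for the sketch; drop this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data: 7 features plus a target column "y" (assumption)
rng = np.random.default_rng(1)
feature_cols = [f"f{i}" for i in range(1, 8)]
demo = pd.DataFrame(rng.normal(size=(40, 8)), columns=feature_cols + ["y"])

# Long form: one row per (observation, feature) pair, y repeated for each feature
long_df = demo.melt(id_vars="y", var_name="feature", value_name="value")

# One facet per feature, wrapped after 3 columns; x axes are independent
g = sns.relplot(data=long_df, x="value", y="y",
                col="feature", col_wrap=3, height=2.5,
                facet_kws={"sharex": False})

# The whole grid is a single figure, so one savefig call captures every row
out_path = os.path.join(tempfile.mkdtemp(), "wrapped.png")
g.savefig(out_path)
```

The trade-off is the reshaping step: melt multiplies the number of rows by the number of features, which may matter for large datasets.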


Here is code which uses FacetGrid with col_wrap:

from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; requires an older version
import math
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

X, y = load_boston(return_X_y=True)
X = pd.DataFrame(X)
y = pd.DataFrame(y)
data = pd.concat([X, y], axis=1)

y_name = 'y'
features_names = [f'feature_{i}' for i in range(1, X.shape[1]+1)]  
column_names = features_names + [y_name]
data.columns = column_names

plot_size=7
num_plots_x=5   # No. of plots in every row
num_plots_y = math.ceil(len(features_names)/num_plots_x)   # No. of plots in y direction

'''
for i in range(num_plots_y):
    start = i * num_plots_x
    end = start + num_plots_x
    sns.pairplot(x_vars=features_names[start:end], y_vars=y_name, data=data)
'''

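# FacetGrid is used only for layout: one empty axes per feature, wrapped 4 per row
g = sns.FacetGrid(pd.DataFrame(features_names), col=0, col_wrap=4, sharex=False)
# draw each feature against y on its own axes
for ax, x_var in zip(g.axes, features_names):
    sns.scatterplot(data=data, x=x_var, y=y_name, ax=ax)
g.tight_layout()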

plt.savefig('figure.png')
plt.show()

Result ('figure.png'):

[image]
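If a plain matplotlib grid is preferred over FacetGrid, a similar layout can be sketched with plt.subplots and one scatterplot call per axes. This is only an illustration on synthetic data; demo, feature_cols and n_cols are hypothetical names, not part of the original code:

```python
import math
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend for the sketch; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data: 7 features plus a target column "y" (assumption)
rng = np.random.default_rng(2)
feature_cols = [f"f{i}" for i in range(1, 8)]
demo = pd.DataFrame(rng.normal(size=(40, 8)), columns=feature_cols + ["y"])

n_cols = 4                                      # plots per row
n_rows = math.ceil(len(feature_cols) / n_cols)  # rows needed for all features

# squeeze=False keeps `axes` two-dimensional even when there is a single row
fig, axes = plt.subplots(n_rows, n_cols,
                         figsize=(3 * n_cols, 3 * n_rows), squeeze=False)
flat = axes.ravel()

# scatterplot is axes-level, so it can draw into an existing axes via `ax`
for ax, name in zip(flat, feature_cols):
    sns.scatterplot(data=demo, x=name, y="y", ax=ax)

# Hide the unused axes in the last row
for ax in flat[len(feature_cols):]:
    ax.set_visible(False)

fig.tight_layout()
out_path = os.path.join(tempfile.mkdtemp(), "grid.png")
fig.savefig(out_path)  # one figure, so every row ends up in the file
```

Because everything is drawn on a single figure, the saved file contains all rows, and figsize scales with the number of rows so the individual plots stay readable.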


 