How do I plot a regression in a seaborn PairGrid that depends on which variable is plotted, rather than on whether the subplot is in the upper, lower, or diagonal position? For example, with the tips data set I believe that 'size' is related to every other variable through a second-order polynomial, so I want that fit across the entire 'size' row/column of the PairGrid and nowhere else. However, all I can manage is to map the fit onto every plot in the upper or lower triangle, like this:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
smoke = sns.PairGrid(tips, vars=['total_bill', 'tip','size'])
smoke.map_upper(sns.regplot, color = 'k', order=2)
smoke.map_diag(sns.kdeplot)
smoke.map_lower(sns.regplot, color = 'b')
Is this possible with seaborn? Going further, what if I want to check/plot an exponential correlation between e.g. 'tip' and 'total_bill' within the PairGrid itself? How would I do that?
I know I can take this specific case outside and plot it separately, or use GridSpec, but I wonder if there is an easier way. Thanks.
EDIT (26.4.): An additional question is how to use hue in this setup. If I simply use:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
vars = ['total_bill', 'tip','size']
smoke = sns.PairGrid(tips, vars=vars, hue='smoker')
smoke.map_upper(plt.scatter)
smoke.map_diag(sns.kdeplot)
smoke.map_lower(plt.scatter)
# Add 2nd order polynomial regression to the 'size' column
for ax,y in zip(smoke.axes[:2,2],vars):
    sns.regplot(ax=ax, data=tips, x='size', y=y, order=2, scatter=False)
    ax.set_ylabel('')
    ax.set_xlabel('')
# Add logarithmic regression
sns.regplot(ax=smoke.axes[2,0], data=tips, x="total_bill", y='size', logx=True, scatter=False)
It does what I want, i.e. it fits the logarithmic regression, but very strangely: the regression line is blue in the first row only, orange in the second row only, and green in the first column of the last row. So my question is how to fix this and why it occurs in the first place. Does hue create a new set of axes that then need to be iterated over?
PairGrid only lets you map the diagonal, the off-diagonal, and the upper and lower triangles. If you want more fine-grained control over the plots, you can access the individual Axes objects through PairGrid.axes (a 2D array):
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
vars = ['total_bill', 'tip','size']
smoke = sns.PairGrid(tips, vars=vars)
smoke.map_upper(plt.scatter, color = 'k')
smoke.map_diag(sns.kdeplot)
smoke.map_lower(plt.scatter, color = 'b')
# Add 2nd order polynomial regression to the 'size' column
for ax,y in zip(smoke.axes[:2,2],vars):
    sns.regplot(ax=ax, data=tips, x='size', y=y, order=2, color='k', scatter=False)
# Add logarithmic regression
sns.regplot(ax=smoke.axes[2,0], data=tips, x="total_bill", y='size', logx=True, color='b', scatter=False)
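The question also asks about an exponential fit between 'tip' and 'total_bill'. As far as I know, regplot has no exponential option, so here is a minimal sketch that fits y ≈ a * exp(b * x) by linear least squares on log(y) and draws the curve on a given axes. The helper names exp_fit and add_exp_fit are hypothetical (not part of seaborn), the sketch assumes all y values are strictly positive, and fitting in log space is not the same as a nonlinear least-squares fit on the original scale.

```python
import numpy as np

def exp_fit(x, y):
    """Fit y ~ a * exp(b * x) via a straight-line fit of log(y) on x.

    Assumes every y value is strictly positive.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, log_a = np.polyfit(x, np.log(y), 1)  # slope first, then intercept
    return np.exp(log_a), b

def add_exp_fit(ax, x, y, n_points=100, **line_kws):
    """Draw the fitted exponential curve on an existing axes (hypothetical helper)."""
    a, b = exp_fit(x, y)
    grid = np.linspace(np.min(x), np.max(x), n_points)
    ax.plot(grid, a * np.exp(b * grid), **line_kws)
    return a, b
```

With the grid from the code above, add_exp_fit(smoke.axes[1, 0], tips['total_bill'], tips['tip'], color='b') would draw the curve in the panel that shows 'tip' (row 1) against 'total_bill' (column 0).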
EDIT: a solution that works with hue-splitting
In this case, you have to run the regression on each subset of the data and plot the results on the same axes.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
vars = ['total_bill', 'tip','size']
hue_col = 'smoker'
hue_order=['Yes','No']
smoke = sns.PairGrid(tips, vars=vars, hue='smoker', hue_order=hue_order)
smoke.map_upper(plt.scatter)
smoke.map_diag(sns.kdeplot)
smoke.map_lower(plt.scatter)
# Add 2nd order polynomial regression to the 'size' column
for ax,y in zip(smoke.axes[:2,2],vars):
    for hue in hue_order:
        sns.regplot(ax=ax, data=tips.loc[tips[hue_col]==hue], x='size', y=y, order=2, scatter=False)
    ax.set_ylabel('')
    ax.set_xlabel('')
# Add logarithmic regression
for hue in hue_order:
    sns.regplot(ax=smoke.axes[2,0], data=tips.loc[tips[hue_col]==hue], x="total_bill", y='size', logx=True, scatter=False)
Yes, it's possible, because you can specify the x- and y-variables separately, e.g.:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
smoke = sns.PairGrid(tips, x_vars=['total_bill', 'tip','size'], y_vars=['size'])
smoke.map(sns.regplot, color = 'k', order=2)
smoke.map_diag(sns.kdeplot)
To plot various kinds of regression functions, you would have to access each axes (subplot) individually.
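To pick a subplot by variable name rather than by position, the lists passed to PairGrid can be used to look up the row and column. This is only a sketch: axes_index is a hypothetical helper, and it assumes that rows follow the y-variables and columns follow the x-variables, in the order they were given.

```python
def axes_index(x_vars, y_vars, x, y):
    """Return (row, col) of the subplot that shows `y` against `x`.

    Assumes rows follow y_vars and columns follow x_vars, in order.
    """
    return list(y_vars).index(y), list(x_vars).index(x)

pair_vars = ['total_bill', 'tip', 'size']
row, col = axes_index(pair_vars, pair_vars, x='total_bill', y='size')
print(row, col)  # prints "2 0"
```

With that index, a call such as sns.regplot(ax=smoke.axes[row, col], data=tips, x='total_bill', y='size', logx=True, scatter=False) adds a different kind of fit to that one subplot only.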