
How to add a mean and median line to a Seaborn displot

Is there a way to add the mean and median to Seaborn's displot?

import seaborn as sns

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)


Based on "Add mean and variability to seaborn FacetGrid distplots", I see that I can define a FacetGrid and map a function onto it. Can I pass a custom function to displot?

The reason for trying to use displot directly is that the plots are much prettier out of the box, without tweaking tick label size, axis label size, etc. and are visually consistent with other plots I am making.

def specs(x, **kwargs):
    ax = sns.histplot(x=x)
    ax.axvline(x.mean(), color='k', lw=2)
    ax.axvline(x.median(), color='k', ls='--', lw=2)

g = sns.FacetGrid(data=penguins, col='species')
g.map(specs, 'body_mass_g')


Option 1

  • Use plt.axvline instead of ax.axvline.
    • In the OP, the vlines are drawn on the ax returned by histplot, but here the figure already exists before .map is called, so plt.axvline draws on whichever facet .map has made current.
import matplotlib.pyplot as plt
import seaborn as sns

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)

# called once per facet; x holds that facet's body_mass_g values
def specs(x, **kwargs):
    plt.axvline(x.mean(), c='k', ls='-', lw=2.5)
    plt.axvline(x.median(), c='orange', ls='--', lw=2.5)

g.map(specs, 'body_mass_g')

Option 2

  • This option is more verbose but more flexible: it lets you pull in and plot information from a data source other than the one used to create the displot.
import seaborn as sns
import pandas as pd

# load the data
pen = sns.load_dataset("penguins")

# groupby to get mean and median
pen_g = pen.groupby('species').body_mass_g.agg(['mean', 'median'])

g = sns.displot(
    data=pen, x='body_mass_g',
    col='species',  
    facet_kws=dict(sharey=False, sharex=False)
)
# extract and flatten the axes from the figure
axes = g.axes.flatten()

# iterate through each axes
for ax in axes:
    # extract the species name
    spec = ax.get_title().split(' = ')[1]
    
    # select the data for the species
    data = pen_g.loc[spec, :]
    
    # print data as needed or comment out
    print(data)
    
    # plot the lines
    ax.axvline(x=data['mean'], c='k', ls='-', lw=2.5)
    ax.axvline(x=data['median'], c='orange', ls='--', lw=2.5)

Output for both options

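[Figure: one histogram per species, each with a solid black line at the mean and a dashed orange line at the median]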


Here you can use sns.FacetGrid.facet_data to iterate over the subplot indexes together with the underlying data.

This is close to how sns.FacetGrid.map works under the hood. sns.FacetGrid.facet_data is a generator that yields, for each facet, a tuple (i, j, k) of row, column, and hue indexes along with a DataFrame holding the subset of the full data that belongs to that facet.

import seaborn as sns
import pandas as pd


pen = sns.load_dataset("penguins")

# Set our x_var for later use
x_var = "body_mass_g"

g = sns.displot(
    data=pen,
    x=x_var,
    col="species",
    facet_kws=dict(sharey=False, sharex=False),
)

for (row, col, hue_idx), data in g.facet_data():
    # Skip empty data
    if not data.values.size:
        continue

    # Get the ax for `row` and `col`
    ax = g.facet_axis(row, col)
    # Set the `vline`s using the var `x_var`
    ax.axvline(data[x_var].mean(), c="k", ls="-", lw=2.5)
    ax.axvline(data[x_var].median(), c="orange", ls="--", lw=2.5)

Which outputs: a FacetGrid with vlines overlaid for the mean and median.
