
How to overlay a scatterplot on top of boxplot with sns.catplot?

It is possible to combine axes-level plot functions by simply calling them successively:

import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
sns.set_theme(style="whitegrid")
ax = sns.boxplot(x="day", y="total_bill", data=tips)
ax = sns.stripplot(x="day", y="total_bill", data=tips, 
                   color=".25", alpha=0.7, ax=ax)
plt.show()

How can the same be achieved with the figure-level function sns.catplot()? Successive calls to sns.catplot() create a new figure each time, and it is not possible to pass an existing figure handle.

# This creates two separate figures:
sns.catplot(..., kind="box")
sns.catplot(..., kind="strip")
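The behaviour described above can be checked by counting the open figures after two calls. This is only a minimal sketch: the small DataFrame `df` is made-up stand-in data (not the tips dataset), and the names `g_box`, `g_strip` and `n_figures` are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in data (assumption: any tidy DataFrame behaves the same way)
df = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"] * 4,
    "total_bill": [10.0, 12.5, 20.0, 18.0, 30.5, 25.0,
                   11.0, 14.0, 22.0, 16.5, 28.0, 27.0] * 2,
})

plt.close("all")  # start from a clean slate

# Each figure-level call builds its own FacetGrid and therefore its own figure
g_box = sns.catplot(x="day", y="total_bill", data=df, kind="box")
g_strip = sns.catplot(x="day", y="total_bill", data=df, kind="strip")

n_figures = len(plt.get_fignums())
print(n_figures)  # number of figures currently open
```

Because each call returns a separate grid object, the second plot has to be drawn onto the grid returned by the first call rather than through another sns.catplot() call.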

The following works for me with seaborn v0.11:

import seaborn as sns 
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
g = sns.catplot(x="sex", y="total_bill", hue="smoker", col="time",
                data=tips, kind="box",
                palette=["#FFA7A0", "#ABEAC9"],
                height=4, aspect=.7)
g.map_dataframe(sns.stripplot, x="sex", y="total_bill", 
                hue="smoker", palette=["#404040"], 
                alpha=0.6, dodge=True)
# g.map(sns.stripplot, "sex", "total_bill", "smoker", 
#       palette=["#404040"], alpha=0.6, dodge=True)
plt.show()

Explanation: In a first pass, the box plots are created with sns.catplot(). The function returns a sns.FacetGrid that holds one axes for each value of the faceting variable time. In a second pass, this FacetGrid is reused to overlay the scatter plot (sns.stripplot or, alternatively, sns.swarmplot). The code above uses map_dataframe() because data is a pandas DataFrame with named columns; map() also works, as the commented-out lines show. Setting dodge=True shifts the points along the categorical axis so that each hue category lines up with its own box. Finally, note that calling sns.catplot() with kind="box" first and overlaying the scatter in a second step implicitly circumvents the problem of duplicated legend entries.

(Resulting figure)
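As a variation on the two-pass approach, here is a minimal sketch that overlays sns.swarmplot instead of sns.stripplot and checks that everything ends up in a single figure. The DataFrame `df` is randomly generated stand-in data, not the tips dataset, and the helper names `n_figures` and `facets_with_points` are illustrative. The dark palette is repeated once per hue level, which is an assumption made here to avoid palette-length warnings in newer seaborn versions.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Random stand-in data with the same column names as in the answer (assumption)
rng = np.random.default_rng(0)
rows = [
    {"sex": sex, "smoker": smoker, "time": time, "total_bill": float(value)}
    for sex, smoker, time in itertools.product(
        ["Male", "Female"], ["Yes", "No"], ["Lunch", "Dinner"])
    for value in rng.uniform(5, 40, size=6)
]
df = pd.DataFrame(rows)

plt.close("all")

# First pass: box plots, one facet per value of "time"
g = sns.catplot(x="sex", y="total_bill", hue="smoker", col="time",
                data=df, kind="box",
                palette=["#FFA7A0", "#ABEAC9"],
                height=4, aspect=.7)

# Second pass: reuse the same FacetGrid to draw the points on top
g.map_dataframe(sns.swarmplot, x="sex", y="total_bill", hue="smoker",
                palette=["#404040", "#404040"],
                alpha=0.6, dodge=True)

n_figures = len(plt.get_fignums())
# Scatter points are stored as collections on each facet's axes
facets_with_points = [len(ax.collections) > 0 for ax in g.axes.flat]
```

Depending on the seaborn version, the overlaid function may also draw its own legend inside each facet; if that happens, it can be removed from the axes afterwards.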

Alternative (not recommended): It is also possible to create a FacetGrid object first and then call map_dataframe() twice. This works for this example, but in other situations one has to make sure that the mapping of properties (category order, hue order, colors) is synchronized correctly across facets (see the warning in the FacetGrid docs). sns.catplot() takes care of this, as well as of the legend.

g = sns.FacetGrid(tips, col="time", height=4, aspect=.7)
g.map_dataframe(sns.boxplot, x="sex", y="total_bill", hue="smoker", 
                palette=["#FFA7A0", "#ABEAC9"])
g.map_dataframe(sns.stripplot, x="sex", y="total_bill", hue="smoker", 
                palette=["#404040"], alpha=0.6, dodge=True)
# Note: the default legend does not show the correct entries.
#       Some fix-up step is required here...
# g.add_legend()
plt.show()
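One possible fix-up for the FacetGrid variant, offered only as a sketch and not as the answer's own method: pass order and hue_order explicitly so every facet uses the same mapping, remove any per-axes legends, and hand manually built handles to add_legend() through its legend_data parameter. The stand-in DataFrame `df`, the Patch handles and the variable names are assumptions introduced for this illustration.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.patches import Patch

# Random stand-in data with the same column names as in the answer (assumption)
rng = np.random.default_rng(1)
rows = [
    {"sex": sex, "smoker": smoker, "time": time, "total_bill": float(value)}
    for sex, smoker, time in itertools.product(
        ["Male", "Female"], ["Yes", "No"], ["Lunch", "Dinner"])
    for value in rng.uniform(5, 40, size=6)
]
df = pd.DataFrame(rows)

# Fix the category and hue order once so all facets share the same mapping
order = ["Male", "Female"]
hue_order = ["Yes", "No"]
box_colors = ["#FFA7A0", "#ABEAC9"]

plt.close("all")
g = sns.FacetGrid(df, col="time", height=4, aspect=.7)
g.map_dataframe(sns.boxplot, x="sex", y="total_bill", hue="smoker",
                order=order, hue_order=hue_order, palette=box_colors)
g.map_dataframe(sns.stripplot, x="sex", y="total_bill", hue="smoker",
                order=order, hue_order=hue_order,
                palette=["#404040", "#404040"], alpha=0.6, dodge=True)

# Some seaborn versions draw a legend inside each facet; drop those if present
for ax in g.axes.flat:
    axes_legend = ax.get_legend()
    if axes_legend is not None:
        axes_legend.remove()

# Build one handle per hue level by hand, matching the box colors
handles = {
    label: Patch(facecolor=color, edgecolor="#404040", label=label)
    for label, color in zip(hue_order, box_colors)
}
g.add_legend(legend_data=handles, title="smoker", label_order=hue_order)

# Read the labels back from the figure-level legend
figure = g.axes.flat[0].figure
legend_labels = [text.get_text() for text in figure.legends[0].get_texts()]
```

This keeps the legend limited to one entry per hue level, but it duplicates bookkeeping that sns.catplot() already does, which is why the two-pass catplot approach remains the simpler choice.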
