
How to plot an uneven number of subplots for seaborn histplot

I currently have a list of 13 columns whose distributions I am plotting. I would like to arrange them as a grid of subplots so that the plots take up less space, but I am having a difficult time doing so within a loop.

Sample DataFrame:

import pandas as pd
import numpy as np

data = {'identifier': ['A', 'B', 'C', 'D'],
        'treatment': ['untreated', 'treated', 'untreated', 'treated'], 'treatment_timing': ['pre', 'pre', 'post', 'post'],
        'subject_A': [1.3, 0.0, 0.5, 1.6], 'subject_B': [2.0, 1.4, 0.0, 0.0], 'subject_C': [np.nan, 3.0, 2.0, 0.5],
        'subject_D': [np.nan, np.nan, 1.0, 1.6], 'subject_E': [0, 0, 0, 0], 'subject_F': [1.0, 1.0, 0.4, 0.5]}

df = pd.DataFrame(data)

  identifier  treatment treatment_timing  subject_A  subject_B  subject_C  subject_D  subject_E  subject_F
0          A  untreated              pre        1.3        2.0        NaN        NaN          0        1.0
1          B    treated              pre        0.0        1.4        3.0        NaN          0        1.0
2          C  untreated             post        0.5        0.0        2.0        1.0          0        0.4
3          D    treated             post        1.6        0.0        0.5        1.6          0        0.5
  • The columns run from subject_A to subject_M (13 total).
  • What I am currently doing produces a 13-row, 1-column layout of 13 histograms, one for each subject, separated into 3 colors (pre, post and missing).


Here is what I currently have:

fig, axes = plt.subplots(3,5, sharex=True, figsize=(12,6))

for index, col in enumerate(COL_LIST):
    sns.histplot(
            df ,x=col, hue="time", multiple="dodge", bins=10, ax=axes[index,index % 3]
        ).set_title(col.replace("_", " "))
plt.tight_layout()

This definitely doesn't work. But I'm not sure if there's an easy way to define the axes without having to copy and paste this line 13 times and manually define the axes coordinates.

Using displot is somewhat troublesome because col_wrap errors out:

ValueError: Number of rows must be a positive integer, not 0

(I believe this is due to the presence of np.nan.)

  • It will be easier to use seaborn.displot, a figure-level function that returns a FacetGrid, instead of seaborn.histplot.
    • Explore using row, col, and col_wrap to get the number of rows and columns you want.
  • The subject_ columns must be stacked to convert the dataframe to a tidy (long) format, which can be done with .stack.
import pandas as pd
import seaborn as sns

# convert the dataframe into a long form with stack
df_long = df.set_index(['identifier', 'treatment', 'treatment_timing']).stack().reset_index().rename(columns={'level_3': 'subject', 0: 'vals'})

# sort by subject
df_long = df_long.sort_values('subject').reset_index(drop=True)

# display(df_long.head())
  identifier  treatment treatment_timing    subject  vals
0          A  untreated              pre  subject_A   1.3
1          D    treated             post  subject_A   1.6
2          C  untreated             post  subject_A   0.5
3          B    treated              pre  subject_A   0.0
4          D    treated             post  subject_B   0.0

# plot with displot
sns.displot(data=df_long, row='subject', col='treatment', x='vals', hue='treatment_timing', bins=10)

