Grouped boxplots in pandas and seaborn

I have the following dataframe:

     season           A         B         C         D
0   current   26.978912  0.039233  1.248607  0.025874
1   current   26.978912  0.039233  0.836786  0.025874
2   current   26.978912  0.039233  3.047536  0.025874
3   current   26.978912  0.039233  3.726964  0.025874
4   current   26.978912  0.039233  1.171393  0.025874
5   current   26.978912  0.039233  0.180929  0.025874
6   current   26.978912  0.039233  0.000000  0.025874
7   current   34.709560  0.039233  0.700893  0.025874
8   current  111.140200  0.306142  3.068286  0.169244
9   current  111.140200  0.306142  2.931107  0.169244
10  current  111.140200  0.306142  2.121893  0.169244
11  current  111.140200  0.306142  1.479464  0.169244
12  current  111.140200  0.306142  2.186821  0.169244
13  current  111.140200  0.306142  9.542714  0.169244
14  current  111.140200  0.306142  9.890750  0.169244
15  current  111.140200  0.306142  8.864857  0.169244
16     past   88.176415  0.257901  3.416059  0.141809
17     past   88.176415  0.257901  4.835357  0.141809
18     past   88.176415  0.257901  5.238097  0.141809
19     past   88.176415  0.257901  5.535355  0.141809
20     past   88.176415  0.257901  6.479523  0.141809
21     past   88.176415  0.257901  7.727862  0.141809
22     past   88.176415  0.257901  8.046811  0.141809
23     past   94.037913  0.308439  8.541000  0.163651
24     past  101.630141  0.363136  8.416895  0.192256
25     past  101.630141  0.363136  6.531005  0.192256
26     past  101.630141  0.363136  6.397497  0.192256
27     past  101.630141  0.363136  6.500077  0.192256
28     past  101.630141  0.363136  7.088469  0.192256
29     past  101.630141  0.363136  7.821852  0.192256
30     past  101.630141  0.363136  8.011082  0.192256
31     past  101.037817  0.417099  8.279735  0.212376
32     past   88.176415  0.257901  3.416059  0.141809
33     past   88.176415  0.257901  4.835357  0.141809
34     past   88.176415  0.257901  5.238097  0.141809
35     past   88.176415  0.257901  5.535355  0.141809
36     past   88.176415  0.257901  6.479523  0.141809
37     past   88.176415  0.257901  7.727862  0.141809
38     past   88.176415  0.257901  8.046811  0.141809
39     past   94.037913  0.308439  8.541000  0.163651
40     past  101.630141  0.363136  8.416895  0.192256
41     past  101.630141  0.363136  6.531005  0.192256
42     past  101.630141  0.363136  6.397497  0.192256
43     past  101.630141  0.363136  6.500077  0.192256
44     past  101.630141  0.363136  7.088469  0.192256
45     past  101.630141  0.363136  7.821852  0.192256
46     past  101.630141  0.363136  8.011082  0.192256
47     past  101.037817  0.417099  8.279735  0.212376

and I plot it like this:

df.boxplot(by='season')

[image: output of df.boxplot(by='season')]

How can I make sure that the different panels have different y-axis min and max values? Also, how can I do this in seaborn?

OK, so the first thing you need is long-form data. Let's say you start with this:

import numpy
import pandas
import seaborn
numpy.random.seed(0)

N = 100
seasons = ['winter', 'spring', 'summer', 'autumn']
df = pandas.DataFrame({
    'season': numpy.random.choice(seasons, size=N),
    'A': numpy.random.normal(4, 1.75, size=N),
    'B': numpy.random.normal(4, 4.5, size=N),
    'C': numpy.random.lognormal(0.5, 0.05, size=N),
    'D': numpy.random.beta(3, 1, size=N)
})

print(df.sample(7))

           A         B         C         D  season
85  7.236212  5.044815  1.845659  0.550943  autumn
13  4.749581  1.014348  1.707000  0.630618  autumn
0   1.014027  4.750031  1.637803  0.285781  winter
3   3.233370  8.250158  1.516189  0.973797  winter
44  6.062864 -0.969725  1.564768  0.954225  autumn
43  7.317806 -3.209259  1.699684  0.968950  spring
39  5.576446 -2.187281  1.735002  0.436692  winter

You get it into long-form data with the pandas.melt function.

lf = pandas.melt(df, value_vars=['A', 'B', 'C', 'D'], id_vars='season')
print(lf.sample(7))

     season variable     value
399  winter        D  0.238061
227  spring        C  1.656770
322  autumn        D  0.933299
121  autumn        B  4.393981
6    autumn        A  1.175679
5    autumn        A  5.360608
51   spring        A  5.709118
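
To see what the reshaping does on a small scale, here is a minimal sketch with a two-row frame. The frame `wide` and its values are made up for illustration and are not part of the data above. Each pair of (row, value column) becomes one row of the long-form result, so 2 rows times 2 value columns gives 4 rows.

```python
import pandas

# Tiny wide-form frame; the values are invented for illustration only.
wide = pandas.DataFrame({
    'season': ['winter', 'summer'],
    'A': [1.0, 2.0],
    'B': [10.0, 20.0],
})

# id_vars are repeated on every output row; value_vars are stacked
# into a 'variable' column (the old column name) and a 'value' column.
long_form = pandas.melt(wide, id_vars='season', value_vars=['A', 'B'])
print(long_form)
```

The 'variable' column is what later lets a plotting function split the data into one panel per original column.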

Then you can pipe all of that straight into seaborn.catplot (called seaborn.factorplot before seaborn 0.9):

fg = (
    pandas.melt(df, value_vars=['A', 'B', 'C', 'D'], id_vars='season')
        .pipe(
            (seaborn.catplot, 'data'),    # (<fxn>, <dataframe var>)
            kind='box',                   # type of plot we want
            x='season', order=seasons,    # x-values of the plots and their order
            y='value', palette='BrBG_r',  # y-values and colors
            col='variable', col_wrap=2,   # 'A-D' in columns, wrap at 2nd col
            sharey=False,                 # tailor y-axes for each group
            notch=True, width=0.75,       # kwargs passed to boxplot
        )
)
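
If the .pipe call with a (function, keyword) tuple looks opaque, the same kind of figure can be built by passing the long-form frame as `data` directly. The sketch below assumes seaborn 0.9 or later, where the figure-level function is named `catplot`, and it forces a non-interactive matplotlib backend so it can run without a display. It also reads the y-limits back from each panel to confirm that `sharey=False` gave every variable its own range.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless run, no window needed
import numpy
import pandas
import seaborn

numpy.random.seed(0)
N = 100
seasons = ['winter', 'spring', 'summer', 'autumn']
df = pandas.DataFrame({
    'season': numpy.random.choice(seasons, size=N),
    'A': numpy.random.normal(4, 1.75, size=N),
    'B': numpy.random.normal(4, 4.5, size=N),
    'C': numpy.random.lognormal(0.5, 0.05, size=N),
    'D': numpy.random.beta(3, 1, size=N),
})
lf = pandas.melt(df, id_vars='season', value_vars=['A', 'B', 'C', 'D'])

# Same plot as the piped version, minus the styling keywords.
fg = seaborn.catplot(
    data=lf, kind='box',
    x='season', order=seasons,
    y='value',
    col='variable', col_wrap=2,
    sharey=False,
)

# Collect the y-limits of each panel; they should all differ.
ylims = [ax.get_ylim() for ax in fg.axes.flat]
for ax, (low, high) in zip(fg.axes.flat, ylims):
    print(ax.get_title(), round(low, 2), round(high, 2))
```

The returned object is a FacetGrid, so individual panels remain reachable through `fg.axes` if a specific y-range needs to be set by hand with `ax.set_ylim`.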

And that gives me:

[image: grid of boxplots, one panel per variable A-D]
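
For the pandas-only half of the question, one possible approach (a sketch, not necessarily the only or best way) is to create the grid of axes yourself with matplotlib and draw one column per axis. Because the axes come from `plt.subplots(..., sharey=False)`, each panel scales its y-axis independently. The example reuses the synthetic frame from above and assumes a non-interactive backend.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless run, no window needed
import matplotlib.pyplot as plt
import numpy
import pandas

numpy.random.seed(0)
N = 100
seasons = ['winter', 'spring', 'summer', 'autumn']
df = pandas.DataFrame({
    'season': numpy.random.choice(seasons, size=N),
    'A': numpy.random.normal(4, 1.75, size=N),
    'B': numpy.random.normal(4, 4.5, size=N),
    'C': numpy.random.lognormal(0.5, 0.05, size=N),
    'D': numpy.random.beta(3, 1, size=N),
})

columns = ['A', 'B', 'C', 'D']
# One independent axis per value column.
fig, axes = plt.subplots(nrows=2, ncols=2, sharey=False, figsize=(8, 6))
for ax, column in zip(axes.flat, columns):
    # Draw a single column, grouped by season, onto the given axis.
    df.boxplot(column=column, by='season', ax=ax)
    ax.set_title(column)

fig.suptitle('')  # drop the automatic "Boxplot grouped by season" title
fig.tight_layout()
```

Calling `fig.savefig(...)` afterwards writes the figure to disk; any individual panel can still be adjusted with `ax.set_ylim`.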
