Grouped boxplots in pandas and seaborn

I have the following dataframe:

     season           A         B         C         D
0   current   26.978912  0.039233  1.248607  0.025874
1   current   26.978912  0.039233  0.836786  0.025874
2   current   26.978912  0.039233  3.047536  0.025874
3   current   26.978912  0.039233  3.726964  0.025874
4   current   26.978912  0.039233  1.171393  0.025874
5   current   26.978912  0.039233  0.180929  0.025874
6   current   26.978912  0.039233  0.000000  0.025874
7   current   34.709560  0.039233  0.700893  0.025874
8   current  111.140200  0.306142  3.068286  0.169244
9   current  111.140200  0.306142  2.931107  0.169244
10  current  111.140200  0.306142  2.121893  0.169244
11  current  111.140200  0.306142  1.479464  0.169244
12  current  111.140200  0.306142  2.186821  0.169244
13  current  111.140200  0.306142  9.542714  0.169244
14  current  111.140200  0.306142  9.890750  0.169244
15  current  111.140200  0.306142  8.864857  0.169244
16     past   88.176415  0.257901  3.416059  0.141809
17     past   88.176415  0.257901  4.835357  0.141809
18     past   88.176415  0.257901  5.238097  0.141809
19     past   88.176415  0.257901  5.535355  0.141809
20     past   88.176415  0.257901  6.479523  0.141809
21     past   88.176415  0.257901  7.727862  0.141809
22     past   88.176415  0.257901  8.046811  0.141809
23     past   94.037913  0.308439  8.541000  0.163651
24     past  101.630141  0.363136  8.416895  0.192256
25     past  101.630141  0.363136  6.531005  0.192256
26     past  101.630141  0.363136  6.397497  0.192256
27     past  101.630141  0.363136  6.500077  0.192256
28     past  101.630141  0.363136  7.088469  0.192256
29     past  101.630141  0.363136  7.821852  0.192256
30     past  101.630141  0.363136  8.011082  0.192256
31     past  101.037817  0.417099  8.279735  0.212376
32     past   88.176415  0.257901  3.416059  0.141809
33     past   88.176415  0.257901  4.835357  0.141809
34     past   88.176415  0.257901  5.238097  0.141809
35     past   88.176415  0.257901  5.535355  0.141809
36     past   88.176415  0.257901  6.479523  0.141809
37     past   88.176415  0.257901  7.727862  0.141809
38     past   88.176415  0.257901  8.046811  0.141809
39     past   94.037913  0.308439  8.541000  0.163651
40     past  101.630141  0.363136  8.416895  0.192256
41     past  101.630141  0.363136  6.531005  0.192256
42     past  101.630141  0.363136  6.397497  0.192256
43     past  101.630141  0.363136  6.500077  0.192256
44     past  101.630141  0.363136  7.088469  0.192256
45     past  101.630141  0.363136  7.821852  0.192256
46     past  101.630141  0.363136  8.011082  0.192256
47     past  101.037817  0.417099  8.279735  0.212376

and I plot it like this:

df.boxplot(by='season')

[image: the resulting box plots]

How can I give each panel its own y-axis minimum and maximum? Also, how can I do this in seaborn?

OK, so the first thing you need is long-form data. Let's say you start with this:

import numpy
import pandas
import seaborn
numpy.random.seed(0)

N = 100
seasons = ['winter', 'spring', 'summer', 'autumn']
df = pandas.DataFrame({
    'season': numpy.random.choice(seasons, size=N),
    'A': numpy.random.normal(4, 1.75, size=N),
    'B': numpy.random.normal(4, 4.5, size=N),
    'C': numpy.random.lognormal(0.5, 0.05, size=N),
    'D': numpy.random.beta(3, 1, size=N)
})

print(df.sample(7))

           A         B         C         D  season
85  7.236212  5.044815  1.845659  0.550943  autumn
13  4.749581  1.014348  1.707000  0.630618  autumn
0   1.014027  4.750031  1.637803  0.285781  winter
3   3.233370  8.250158  1.516189  0.973797  winter
44  6.062864 -0.969725  1.564768  0.954225  autumn
43  7.317806 -3.209259  1.699684  0.968950  spring
39  5.576446 -2.187281  1.735002  0.436692  winter

You get it into long-form data with the pandas.melt function.

lf = pandas.melt(df, value_vars=['A', 'B', 'C', 'D'], id_vars='season')
print(lf.sample(7))

     season variable     value
399  winter        D  0.238061
227  spring        C  1.656770
322  autumn        D  0.933299
121  autumn        B  4.393981
6    autumn        A  1.175679
5    autumn        A  5.360608
51   spring        A  5.709118
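To make the reshaping step concrete, here is a minimal sketch on a tiny made-up frame (the names `wide` and `tidy` and the numbers are only illustrative, not from the data above). Each value column is stacked under a single `value` column, and its former column name goes into `variable`:

```python
import pandas

# Hypothetical two-row wide table: one column per measured variable.
wide = pandas.DataFrame({
    'season': ['winter', 'summer'],
    'A': [1.0, 2.0],
    'B': [10.0, 20.0],
})

# id_vars are repeated for every row; value_vars are stacked on top of each other.
tidy = pandas.melt(wide, id_vars='season', value_vars=['A', 'B'])

# 2 rows x 2 value columns -> 4 rows with columns season, variable, value.
print(tidy)
```

With four value columns and N = 100 rows, the same call yields 400 rows, which is why the sampled index above goes up to 399.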

Then you can pipe all of that straight into seaborn.catplot (called factorplot before seaborn 0.9):

fg = (
    pandas.melt(df, value_vars=['A', 'B', 'C', 'D'], id_vars='season')
        .pipe(
            (seaborn.catplot, 'data'),    # (<fxn>, <dataframe var>)
            kind='box',                   # type of plot we want
            x='season', order=seasons,    # x-values of the plots and their order
            y='value', palette='BrBG_r',  # y-values and colors
            col='variable', col_wrap=2,   # 'A-D' in columns, wrap at 2nd col
            sharey=False,                 # tailor y-axes for each group
            notch=True, width=0.75,       # kwargs passed to boxplot
        )
)

And that gives me:

[image: the resulting grid of box plots]
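For the pandas half of the question, one possible approach (a sketch, not the only way) is to create the axes yourself with matplotlib using sharey=False and draw one column per axes. It uses the same synthetic data as above; the 2x2 layout, the figure size, and the Agg backend line are my own assumptions so the snippet is self-contained and runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy
import pandas

numpy.random.seed(0)
N = 100
seasons = ['winter', 'spring', 'summer', 'autumn']
df = pandas.DataFrame({
    'season': numpy.random.choice(seasons, size=N),
    'A': numpy.random.normal(4, 1.75, size=N),
    'B': numpy.random.normal(4, 4.5, size=N),
    'C': numpy.random.lognormal(0.5, 0.05, size=N),
    'D': numpy.random.beta(3, 1, size=N),
})

columns = ['A', 'B', 'C', 'D']

# sharey=False gives every panel its own y-axis limits.
fig, axes = plt.subplots(nrows=2, ncols=2, sharey=False, figsize=(8, 6))

for ax, col in zip(axes.flat, columns):
    # One column per axes, grouped by season; pandas titles the panel with the column name.
    df.boxplot(column=col, by='season', ax=ax)
    ax.set_xlabel('')

fig.suptitle('')   # clear the automatic "Boxplot grouped by season" title
fig.tight_layout()
```

Because each axes is autoscaled independently, a column such as D (bounded between 0 and 1) is no longer squashed by the much wider range of B.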

