
Seaborn categorical plot with hue from DataFrame rows

I have this pandas DataFrame:

>>> print(df)
Channel     0     1     2     3     4     5     6     7
Sample                                                 
7d       3.82  4.10  3.86  3.86  3.95  3.65  3.43  3.63
12d      2.97  4.32  3.50  3.58  3.22  3.37  3.58  3.78
17d      4.01  4.04  4.10  3.43  3.76  3.26  3.35  3.48
DO       3.07  3.58  3.14  3.22  3.11  3.09  3.16  3.16

I want to make a plot similar to this one (produced with sns.swarmplot(df)):

[image]

But the colors should be set not per channel (i.e. per DataFrame column) but per sample (i.e. per DataFrame row). So each "category" on the x-axis would have 4 colors, corresponding to the rows 7d, 12d, 17d and DO.

Is there an easy way to accomplish this in seaborn?

EDIT: I should add that I tried using the hue keyword, but it complains that x and y must also be given. According to this example, it seems that I need to create a new DataFrame with all the numeric values in one column, plus two other columns holding the sample and channel information. Then I can call the plot as sns.swarmplot(x='Channel', y='values', hue='Sample', data=...). Is there a more direct way that does not involve creating an additional ad-hoc DataFrame?

EDIT2: Following @BrenBarn's suggestion, I ended up creating a new "tidy" DataFrame with:

dd = []
for sa in df.index:
    print(sa)
    d = pd.DataFrame(df.loc[sa]).reset_index()
    d.columns = ['Channel', 'Leakage']
    d['Sample'] = sa
    dd.append(d)
ddf = pd.concat(dd)

And then plotting the data with:

sns.swarmplot(x='Channel', y='Leakage', hue='Sample', data=ddf)

which gives the plot I expected:

[image]

I was hoping there was a way to tell seaborn to use the original "2-D table" format for the plot, which is much more compact and natural for this kind of data. If this is possible, I would accept the answer ;).

You've basically answered your own question in the edit, but you may want to look at pd.melt or DataFrame.stack as an easier way of creating your new tidy DataFrame.

For example:

s = df.stack()             # long Series indexed by (Sample, Channel)
s.name = 'values'
df_tidy = s.reset_index()  # columns: Sample, Channel, values
sns.stripplot(data=df_tidy, hue='Sample', x='Channel', y='values')

or

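# the index is named 'Sample' (capital S), so that is the id column after reset_index()
df_tidy = pd.melt(df.reset_index(), id_vars=['Sample'], value_vars=df.columns.tolist(), var_name='Channel', value_name='values')
sns.stripplot(data=df_tidy, hue='Sample', x='Channel', y='values')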
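If you want to try this end to end, here is a minimal sketch that rebuilds the table from the question and reshapes it both ways. The DataFrame construction is my reconstruction from the printed output, and `plot_by_sample` is just a hypothetical helper name for illustration; seaborn is imported inside it so the reshaping part runs with pandas alone. Note that the index is named `Sample` with a capital S, so that is the column name you get after `reset_index()`.

```python
import pandas as pd

# Wide table rebuilt from the printed df in the question (an assumption:
# the original may have been constructed differently).
df = pd.DataFrame(
    [[3.82, 4.10, 3.86, 3.86, 3.95, 3.65, 3.43, 3.63],
     [2.97, 4.32, 3.50, 3.58, 3.22, 3.37, 3.58, 3.78],
     [4.01, 4.04, 4.10, 3.43, 3.76, 3.26, 3.35, 3.48],
     [3.07, 3.58, 3.14, 3.22, 3.11, 3.09, 3.16, 3.16]],
    index=pd.Index(["7d", "12d", "17d", "DO"], name="Sample"),
    columns=pd.Index(range(8), name="Channel"),
)

# Route 1: stack the columns into a long Series, then flatten the index.
tidy_stack = df.stack().rename("values").reset_index()

# Route 2: melt, naming the variable column explicitly.
tidy_melt = df.reset_index().melt(
    id_vars="Sample", var_name="Channel", value_name="values"
)


def plot_by_sample(tidy):
    """Hypothetical helper: swarm plot with one hue per sample (row)."""
    import seaborn as sns  # lazy import; only needed for plotting
    return sns.swarmplot(data=tidy, x="Channel", y="values", hue="Sample")
```

Both routes give one row per (Sample, Channel) pair, just in a different row order, so either `plot_by_sample(tidy_stack)` or `plot_by_sample(tidy_melt)` should produce the same plot.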
