Color Code Dataframe in Scatter Plot

Let's say I have the following dataframe:

X    Y     Category
1    2         A 
5    3         B 
-1   1         C 
7    0         A 
1    2         B 
...

I want to color-code the points given by df['X'] and df['Y'] according to their category (df['Category']).

I have tried this so far:

cm = pd.unique(df['Category'])
plt.scatter(data['X'], data['Y'], c=cm)

but it's telling me

c of shape (37,) not acceptable as a color sequence for x with size 67725, y with size 67725

It is much simpler to do this with a higher-level library such as seaborn, specifically through seaborn.lmplot:

import seaborn as sns

sns.lmplot(x='X', y='Y', hue='Category', data=df)

and let it take care of the details.

See Plotting With Categorical Data for seaborn's other options for plotting categorical data.
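
As for the original error: matplotlib's c argument expects a single color or one value per point, whereas pd.unique(df['Category']) returns one entry per distinct category, which is why the shapes (37 versus 67725) disagree. A minimal matplotlib-only sketch, assuming the column names from the question and using its five sample rows as stand-in data, is to draw each category as its own group:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data copied from the sample rows in the question.
df = pd.DataFrame({
    "X": [1, 5, -1, 7, 1],
    "Y": [2, 3, 1, 0, 2],
    "Category": ["A", "B", "C", "A", "B"],
})

fig, ax = plt.subplots()
# One scatter call per category: each call takes the next color in
# matplotlib's default cycle, and the label feeds the legend.
for name, group in df.groupby("Category"):
    ax.scatter(group["X"], group["Y"], label=name)
ax.legend(title="Category")
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure.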

You can reshape your dataframe and use pandas plot.

df.set_index(['X','Category'])['Y'].unstack().plot(marker='o',linestyle='none')
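
To see what the reshape does, here is a small sketch on the sample rows from the question (stand-in data, not the asker's full frame). unstack() moves each category into its own column, so pandas draws one colored series per category. This assumes every (X, Category) pair is unique; unstack() raises a ValueError on duplicate index entries, which may well occur in a large frame.

```python
import pandas as pd

# Stand-in data copied from the sample rows in the question.
df = pd.DataFrame({
    "X": [1, 5, -1, 7, 1],
    "Y": [2, 3, 1, 0, 2],
    "Category": ["A", "B", "C", "A", "B"],
})

# Index by (X, Category), keep Y, then pivot Category into columns.
wide = df.set_index(["X", "Category"])["Y"].unstack()
print(wide)
```

Combinations that do not occur in the data become NaN and are simply not drawn.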

Or you can use seaborn:

import seaborn as sns
_ = sns.pointplot(x='X',y='Y', hue='Category', data=df, linestyles='none')
