
How to Color Specific Data Points on a Plot Based on Column Value in Pandas Dataframe

Let's say I have the following dataset:

import pandas as pd

d = {'Team': ['Duke', 'LSU'], 'Wins': [20, 18], 'Losses': [5, 7], 'Conference': ['ACC', 'SEC']}
df = pd.DataFrame(data=d)
df

    Team    Wins   Losses   Conference
0   Duke     20      5          ACC
1   LSU      18      7          SEC

Then I make a scatterplot of it:

import matplotlib.pyplot as plt
plt.plot(d['Losses'], d['Wins'], 'o')

I would like to color-code my scatter plot by Conference. More specifically, I would like to color only SEC teams red, while all other data points stay the default blue. Additionally, how would I color just Duke red, while every other data point is blue? I have about 200 teams in my dataset. How would I go about doing this? Thanks!

If I understand correctly, you can try:

import pandas as pd
import numpy as np

d = {'Team': ['Duke', 'LSU'], 
     'Wins': [20, 18], 
     'Losses' : [5, 7], 
     'Conference' : ['ACC', 'SEC']}
df = pd.DataFrame(data=d)

df["color"] = np.where(df["Conference"]=="SEC", "red", "blue")

df.plot(x='Losses', y='Wins', kind="scatter", color=df["color"]);

To apply the same logic to Duke, change the condition in the np.where line so that it tests the Team column instead, i.e. df["Team"] == "Duke".
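As a minimal sketch of that change, using the same two-row sample data from the question: the condition now compares the Team column, and the commented-out isin line is just one possible way to highlight several teams at once (the list of teams there is illustrative).

```python
import numpy as np
import pandas as pd

d = {'Team': ['Duke', 'LSU'],
     'Wins': [20, 18],
     'Losses': [5, 7],
     'Conference': ['ACC', 'SEC']}
df = pd.DataFrame(data=d)

# Red for Duke only, blue for every other team
df["color"] = np.where(df["Team"] == "Duke", "red", "blue")

# To highlight several teams at once, test membership instead
# (the team list here is only an example):
# df["color"] = np.where(df["Team"].isin(["Duke", "LSU"]), "red", "blue")

ax = df.plot(x='Losses', y='Wins', kind="scatter", color=df["color"])
```

With 200 teams the code stays the same, since np.where evaluates the condition for every row.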

Update: For this particular case, I think you should have a look at plotly:

import plotly.express as px
px.scatter(df,x="Losses", y="Wins", color="Conference", hover_name="Team")

