
Matplotlib plotting different lines from one column in Dataframe

I have a DataFrame that tracks stops and has a column 'ethnicity'. I want to plot the number of stops per year for each ethnicity. My df looks something like this:

Year Ethnicity
2001 black
2001 white
2001 black
2002 white
2002 white
2002 black
2002 white

I would now like to plot the total number of stops per ethnicity per year. How would I separate the different ethnicities into different y series, one line each?

Hope it's clear :)

I was able to separate the different ethnicities into two lists of y values (one per line) using this code:

# filter one ethnicity, then count its rows per year; .tolist() drops the year index
y_black = df[df['ethnicity'] == 'Black'].groupby(['year'])['ethnicity'].count().tolist()
y_white = df[df['ethnicity'] == 'White'].groupby(['year'])['ethnicity'].count().tolist()

Hope this helps somebody.
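
The lists above still have to be plotted. Because .tolist() discards the year index, the two lists can fall out of step if one group has no stops in some year. Here is a minimal sketch of one way to plot both lines while keeping the years. It uses the sample rows from the question, and the lowercase column names and labels are assumptions:

```python
import pandas as pd
from matplotlib import pyplot as plt

# Sample rows from the question; the lowercase column names are an assumption
df = pd.DataFrame({
    "year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "ethnicity": ["black", "white", "black", "white", "white", "black", "white"],
})

# Count stops per year for each group; the result is a Series indexed by year
black = df[df["ethnicity"] == "black"].groupby("year")["ethnicity"].count()
white = df[df["ethnicity"] == "white"].groupby("year")["ethnicity"].count()

# Draw both series on the same axes, using the year index as x values
fig, ax = plt.subplots()
ax.plot(black.index, black.values, marker="o", label="black")
ax.plot(white.index, white.values, marker="o", label="white")
ax.set_xlabel("year")
ax.set_ylabel("number of stops")
ax.set_xticks(sorted(df["year"].unique()))  # one tick per year, no fractional years
ax.legend()
plt.show()
```

Keeping each result as a Series means each line is plotted against its own years, so a year missing from one group does not shift the other points.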

You can group your data with groupby and then count the number of occurrences.

from matplotlib import pyplot as plt
import pandas as pd

# generate fake data
import numpy as np
n = 50
np.random.seed(123)
df = pd.DataFrame({"A": np.random.choice([2001, 2002, 2004, 2007], size=n), "B": np.random.choice(list("XYZ"), size=n)})

# group df by both columns -> count the elements in B for each A -> unstack the resulting MultiIndex Series so each value of B becomes a column for matplotlib
plot_df = df.groupby(["A", "B"]).B.count().unstack()

print(plot_df)

# pandas provides common plotting routines built on matplotlib
plot_df.plot()
plt.show()

Sample output:

B     X  Y  Z
A            
2001  7  3  5
2002  1  6  4
2004  2  5  6
2007  4  3  4
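
The same groupby/unstack idea can be applied to the data from the question. The following is a sketch only: the column names "Year" and "Ethnicity" are taken from the sample table, and using size() with fill_value=0 is my own choice so that a year with no stops for some group shows up as 0 rather than NaN:

```python
import pandas as pd
from matplotlib import pyplot as plt

# Sample rows from the question (column names taken from the sample table)
df = pd.DataFrame({
    "Year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "Ethnicity": ["black", "white", "black", "white", "white", "black", "white"],
})

# Count rows per (Year, Ethnicity) pair, then pivot Ethnicity into columns;
# fill_value=0 replaces missing combinations with a zero count
counts = df.groupby(["Year", "Ethnicity"]).size().unstack(fill_value=0)
print(counts)

# One line per column, with the years on the x axis
ax = counts.plot(marker="o")
ax.set_ylabel("number of stops")
ax.set_xticks(list(counts.index))  # one tick per year
plt.show()
```

For the sample rows this gives counts of 2 and 1 for black and 1 and 3 for white in 2001 and 2002 respectively.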

