Matplotlib plotting different lines from one column in Dataframe

Question

I have a DataFrame that records stops, with a column 'ethnicity', and I want to plot the number of stops for each ethnicity per year. My df looks something like this:

Year	Ethnicity
2001	black
2001	white
2001	black
2002	white
2002	white
2002	black
2002	white

I would now like to plot the total number of stops per ethnicity per year. How would I separate the different ethnicities into separate sets of y-values, so that each ethnicity gets its own line?

Hope it's clear :)

Answer 1

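I was able to separate the counts for the different ethnicities into two lists of y-values using this code:

# column names and labels must match the DataFrame exactly (including case)
# filter one ethnicity, group by year, count the rows in each year
y_black = df[df['ethnicity'] == 'Black'].groupby(['year'])['ethnicity'].count().tolist()
y_white = df[df['ethnicity'] == 'White'].groupby(['year'])['ethnicity'].count().tolist()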

Hope this helps somebody.
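
To turn those per-ethnicity counts into lines, here is a minimal sketch that assumes the column names and lowercase labels from the sample table in the question ('Year', 'Ethnicity', 'black', 'white'). It keeps each count as a Series instead of calling .tolist(), so the years stay available as x-values. The output filename is only a placeholder.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question (column names as in the sample table)
df = pd.DataFrame({
    "Year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "Ethnicity": ["black", "white", "black", "white", "white", "black", "white"],
})

fig, ax = plt.subplots()
counts = {}
for label in ["black", "white"]:
    # Filter one ethnicity, then count its rows per year;
    # the result is a Series indexed by year
    s = df[df["Ethnicity"] == label].groupby("Year")["Ethnicity"].count()
    counts[label] = s
    # One line per ethnicity: years on x, counts on y
    ax.plot(s.index, s.values, marker="o", label=label)

ax.set_xticks(sorted(df["Year"].unique()))  # show whole years only
ax.set_xlabel("Year")
ax.set_ylabel("Number of stops")
ax.legend()
fig.savefig("stops_per_year.png")  # placeholder name; or plt.show() interactively
```

Note that a year in which an ethnicity has no stops is simply absent from that Series, so the lines may not all cover the same years.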

Answer 2

You can group your data with groupby and then count the number of occurrences.

from matplotlib import pyplot as plt
import pandas as pd

# fake data generation
import numpy as np
n = 50
np.random.seed(123)
df = pd.DataFrame({"A": np.random.choice([2001, 2002, 2004, 2007], size=n), "B": np.random.choice(list("XYZ"), size=n)})

# group df by both columns -> count elements in B for each A -> unstack the resulting multi-index Series into one column per B value for plotting
plot_df = df.groupby(["A", "B"]).B.count().unstack()

print(plot_df)

# pandas provides common plotting routines using matplotlib
plot_df.plot()
plt.show()

Sample output:

B     X  Y  Z
A            
2001  7  3  5
2002  1  6  4
2004  2  5  6
2007  4  3  4
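
Applied to the data in the question, the same groupby/unstack idea could look like the sketch below. It assumes the column names from the sample table ('Year', 'Ethnicity'). It uses size() rather than count() and passes fill_value=0 so that a year/ethnicity combination with no rows shows up as 0 instead of NaN. Both are choices made for this example, not part of the answer above. The output filename is a placeholder.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "Ethnicity": ["black", "white", "black", "white", "white", "black", "white"],
})

# Rows per (Year, Ethnicity) pair, reshaped to one column per ethnicity
plot_df = df.groupby(["Year", "Ethnicity"]).size().unstack(fill_value=0)
print(plot_df)

# pd.crosstab builds the same year-by-ethnicity count table in one call
alt = pd.crosstab(df["Year"], df["Ethnicity"])

# DataFrame.plot draws one line per column
ax = plot_df.plot(marker="o")
ax.set_xticks(plot_df.index)  # show whole years only
ax.set_ylabel("Number of stops")
ax.figure.savefig("stops_by_ethnicity.png")  # placeholder name; or plt.show()
```

For this sample the table has two rows (2001, 2002) and two columns (black, white), with counts 2 and 1 for 2001 and 1 and 3 for 2002.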

2 answers

solution1 0 2020-12-17 11:11:47
solution2 0 ACCEPTED 2020-12-17 11:26:02
