I have a DataFrame that records stops, with a column 'ethnicity', and I want to plot the number of stops for each ethnicity per year. My df looks something like this:
Year | Ethnicity |
---|---|
2001 | black |
2001 | white |
2001 | black |
2002 | white |
2002 | white |
2002 | black |
2002 | white |
I would now like to plot the total number of stops per ethnicity per year. How would I separate the different ethnicities into different y axes?
Hope it's clear :)
I was able to separate the different ethnicities into two different y axes using this code:
y_black = df[df['ethnicity'] == 'Black'].groupby(['year'])['ethnicity'].count().tolist()
y_white = df[df['ethnicity'] == 'White'].groupby(['year'])['ethnicity'].count().tolist()
Hope this helps somebody
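A minimal sketch of the same filtering approach on the sample rows from the question. The lowercase column names and values here are assumptions made to match the code above and the table; adjust them to your own df. Keeping each result as a Series instead of calling .tolist() preserves the year index, which is useful as the x values when plotting.

```python
import pandas as pd

# Sample rows from the question (column names and casing are assumed)
df = pd.DataFrame({
    "year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "ethnicity": ["black", "white", "black", "white", "white", "black", "white"],
})

# One count per year for each ethnicity; the result is a Series indexed by year
black = df[df["ethnicity"] == "black"].groupby("year")["ethnicity"].count()
white = df[df["ethnicity"] == "white"].groupby("year")["ethnicity"].count()

print(black)
print(white)
```

Each Series can then be passed to matplotlib as plt.plot(black.index, black.values, label="black"). Note that a year in which one ethnicity has no rows will simply be missing from that Series.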
You can group your data with groupby and then count the number of occurrences.
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# fake data generation
n = 50
np.random.seed(123)
df = pd.DataFrame({
    "A": np.random.choice([2001, 2002, 2004, 2007], size=n),
    "B": np.random.choice(list("XYZ"), size=n),
})

# group df by both columns -> count elements in B for each A
# -> unstack the returned MultiIndex Series so each B value becomes a column
plot_df = df.groupby(["A", "B"]).B.count().unstack()
print(plot_df)

# pandas provides common plotting routines using matplotlib
plot_df.plot()
plt.show()
Sample output:
B     X  Y  Z
A
2001  7  3  5
2002  1  6  4
2004  2  5  6
2007  4  3  4
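Applied to the question's data, the same idea might look like the sketch below. The column names "year" and "ethnicity" and the lowercase values are assumptions taken from the question. This variant uses size() with unstack(fill_value=0), so a year with no stops for some ethnicity shows up as 0 rather than NaN.

```python
import pandas as pd

# Sample rows from the question (column names and casing are assumed)
df = pd.DataFrame({
    "year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "ethnicity": ["black", "white", "black", "white", "white", "black", "white"],
})

# Rows are years, columns are ethnicities, cells are stop counts
counts = df.groupby(["year", "ethnicity"]).size().unstack(fill_value=0)
print(counts)
```

Calling counts.plot() (or counts.plot(kind="bar")) followed by plt.show() then draws one line (or one group of bars) per ethnicity.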