I am using Pandas to plot a DataFrame that contains three columns: Interest, Gender, and Experience Points.
I want to bin the Experience Points into specific ranges and then group the DataFrame by the binned values, Interest, and Gender. I then want to plot the counts by Interest for a specific Gender (e.g. Male).
Using the code below, I was able to get my desired plot. However, Pandas sorts the binned values incorrectly on the x-axis (see the attached image).
Notice that when I print the grouped result, the binned values are in the correct order, but in the graph they are sorted incorrectly.
Experience Points  Interest  Gender
(0, 8]             Bike      Female     9
                             Male       5
                   Hike      Female     6
                             Male      10
                   Swim      Female     7
                             Male       7
(8, 16]            Bike      Female     8
                             Male       3
                   Hike      Female     4
                             Male       7
                   Swim      Female    10
                             Male       4
(16, 24]           Bike      Female     4
                             Male       6
                   Hike      Female    10
...
My code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import random
matplotlib.style.use('ggplot')
interest = ['Swim','Bike','Hike']
gender = ['Male','Female']
experience_points = np.arange(0,200)
df = pd.DataFrame({'Interest': [random.choice(interest) for x in range(1000)],
                   'Gender': [random.choice(gender) for x in range(1000)],
                   'Experience Points': [random.choice(experience_points) for x in range(1000)]})
bins = np.arange(0,136,8)
exp_binned = pd.cut(df['Experience Points'],np.append(bins,df['Experience Points'].max()+1))
exp_distribution = df.groupby([exp_binned,'Interest','Gender']).size()
# The printed result is sorted correctly by the binned values
print(exp_distribution)
# The plotted result is sorted incorrectly by the binned values
exp_distribution.unstack(['Gender','Interest'])['Male'].plot(kind='bar')
plt.show()
Troubleshooting Steps Tried:
Using plot(kind='bar', sort_columns=True) does NOT fix the issue.
Grouping by only the binned values and then plotting DOES fix the issue, but then I am unable to group by Interest or Gender. For example, the following works:
exp_distribution = df.groupby([exp_binned]).size()
exp_distribution.plot(kind='bar')
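One possible workaround, sketched here under the assumption that the mis-ordering comes from the bin labels being re-sorted as text after unstack (so that '(16, 24]' lands before '(8, 16]'): group by integer bin codes, which sort numerically, and only attach the readable labels at the end. The names edges, labels, bin_code, binned, and male_counts are made up for this sketch, and the random data only stands in for the real DataFrame.

```python
import numpy as np
import pandas as pd

# Stand-in data with the same three columns as in the question
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    'Interest': rng.choice(['Swim', 'Bike', 'Hike'], size=n),
    'Gender': rng.choice(['Male', 'Female'], size=n),
    'Experience Points': rng.integers(0, 200, size=n),
})

# Same bin edges as in the question: steps of 8 up to 128, then one catch-all bin
edges = np.append(np.arange(0, 136, 8), df['Experience Points'].max() + 1)
labels = ['({}, {}]'.format(lo, hi) for lo, hi in zip(edges[:-1], edges[1:])]

# labels=False makes pd.cut return integer bin positions instead of interval labels.
# Values outside every bin (here: 0, because the bins are open on the left) become NaN.
bin_code = pd.cut(df['Experience Points'], edges, labels=False)
binned = (df.assign(bin_code=bin_code)
            .dropna(subset=['bin_code'])
            .astype({'bin_code': int}))

# Group exactly as before, but on the numeric code rather than the label
counts = binned.groupby(['bin_code', 'Interest', 'Gender']).size()
male_counts = counts.unstack(['Gender', 'Interest'], fill_value=0)['Male']

# Make sure every bin is present, in numeric order, then swap in the readable labels
male_counts = male_counts.reindex(range(len(labels)), fill_value=0)
male_counts.index = labels

print(male_counts.head())
```

Calling male_counts.plot(kind='bar') on this result should draw the bars in the row order of the table, since the rows are already in bin order and are no longer re-sorted by label. This has not been checked against the Pandas version used in the question.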