How can I plot a stacked bar chart of the median of a column in a pandas DataFrame?

I am new to data visualization in pandas (Python). My task is to create a stacked chart of the median WeekHrs and CodeRevHrs for the age group 30 to 35.

The following is my code, where I extract the data by applying a filter on the age column. Below it are the first five rows of the filtered dataset.

age_filter = agework[(agework["age"] >= 30) & (agework["age"] <= 35)]
median_weekhrs = age_filter["Weekhrs"].median()
median_coderev = age_filter["CodeRevHrs"].median()

age_filter.head()

    CodeRevHrs  Weekhrs age
5   3.0          8.0    31.0
11  2.0         40.0    34.0
12  2.0         40.0    32.0
18  15.0        42.0    34.0
22  2.0         40.0    33.0

How can I plot a stacked bar chart of these medians?

Please help.

First, filter for age (and also convert age to int, as it makes for cleaner labels):

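import matplotlib.pyplot as plt

df = agework.query('30 <= age <= 35').copy()  # copy, so the assignment below does not trigger SettingWithCopyWarning
df['age'] = df['age'].astype(int)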
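
For anyone who wants to try the filtering step without the original dataset, here is a minimal sketch. The `agework` frame below is hypothetical sample data: its first five rows are the ones shown in the question, and the sixth is an invented out-of-range row. It uses `Series.between`, which is inclusive on both ends by default and so selects the same rows as the `query` call above.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's `agework` frame.
# The last row (age 41) is invented to show that the filter drops it.
agework = pd.DataFrame({
    "CodeRevHrs": [3.0, 2.0, 2.0, 15.0, 2.0, 4.0],
    "Weekhrs": [8.0, 40.0, 40.0, 42.0, 40.0, 45.0],
    "age": [31.0, 34.0, 32.0, 34.0, 33.0, 41.0],
})

# between() is inclusive on both ends by default, like 30 <= age <= 35.
mask = agework["age"].between(30, 35)

# Take an explicit copy so the column assignment modifies an independent frame.
df = agework.loc[mask].copy()
df["age"] = df["age"].astype(int)
```

Whether you use `query` or a boolean mask is mostly a matter of taste; the mask form is handy when the bounds are held in Python variables.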

Then, you could plot a stacked bar chart of the median of the two quantities in each age group:

df.groupby('age').median().plot.bar(stacked=True)
plt.title('Median hours, by age')

BTW, you can impose an arbitrary order on how the values are stacked. For example, if you'd rather have 'Weekhrs' at the bottom, you can say:

order = ['Weekhrs', 'CodeRevHrs']
df.groupby('age')[order].median().plot.bar(stacked=True)
plt.title('Median hours, by age')
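
To make the per-age plot reproducible end to end, here is a self-contained sketch. The data is just the five rows shown in the question (already filtered, with age as int), so the medians are only illustrative. The `Agg` backend and the y-axis label are my own additions, not part of the original snippet.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# The five sample rows from the question, with age already converted to int.
df = pd.DataFrame({
    "CodeRevHrs": [3.0, 2.0, 2.0, 15.0, 2.0],
    "Weekhrs": [8.0, 40.0, 40.0, 42.0, 40.0],
    "age": [31, 34, 32, 34, 33],
})

# Column order controls stacking order: the first column sits at the bottom.
order = ["Weekhrs", "CodeRevHrs"]

# One row per age, one column per quantity, each cell a median.
medians = df.groupby("age")[order].median()

# stacked=True piles the columns on top of each other within each age bar.
ax = medians.plot.bar(stacked=True)
ax.set_title("Median hours, by age")
ax.set_ylabel("Hours")
plt.tight_layout()
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()`, or keep it and write the figure out with `plt.savefig(...)`.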

Now, if you'd rather plot the overall median of these quantities for the entire filtered age range (as you say: a single number for each quantity), then one way (among many) would be:

label = f"{df['age'].min()}-{df['age'].max()}"
df.median().drop('age').to_frame(label).T.plot.bar(stacked=True)
plt.title(f'Median hours for age {label}')
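
The single-bar variant can be sketched the same way. Again, the frame holds only the five sample rows from the question, so the label comes out as the min and max age present in that sample rather than the nominal 30 to 35 range. Selecting the two columns explicitly (instead of `drop('age')`) and the `Agg` backend are my own choices here.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# The five sample rows from the question, with age already converted to int.
df = pd.DataFrame({
    "CodeRevHrs": [3.0, 2.0, 2.0, 15.0, 2.0],
    "Weekhrs": [8.0, 40.0, 40.0, 42.0, 40.0],
    "age": [31, 34, 32, 34, 33],
})

# Label built from the ages actually present in the filtered data.
label = f"{df['age'].min()}-{df['age'].max()}"

# Overall median of each quantity: a Series with one value per column.
overall = df[["Weekhrs", "CodeRevHrs"]].median()

# to_frame(label) makes a one-column frame named after the label;
# .T flips it into a single row, so plot.bar draws one stacked bar.
single_row = overall.to_frame(label).T

ax = single_row.plot.bar(stacked=True)
ax.set_title(f"Median hours for age {label}")
plt.tight_layout()
```

The transpose is what makes this a stacked bar: pandas stacks columns within a row, so the two quantities must be columns and the age range must be the lone row.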
