How to make a stacked bar plot for percentage of classes per year
I need to make a stacked bar plot using this dataset (head):
import pandas as pd

data = {'model': ['A1', 'A6', 'A1', 'A4', 'A3'],
        'year': [2017, 2016, 2016, 2017, 2019],
        'price': [12500, 16500, 11000, 16800, 17300],
        'transmission': ['Manual', 'Automatic', 'Manual', 'Automatic', 'Manual'],
        'mileage': [15735, 36203, 29946, 25952, 1998],
        'fuelType': ['Petrol', 'Diesel', 'Petrol', 'Diesel', 'Petrol'],
        'tax': [150, 20, 30, 145, 145],
        'mpg': [55.4, 64.2, 55.4, 67.3, 49.6],
        'engineSize': [1.4, 2.0, 1.4, 2.0, 1.0]}
df = pd.DataFrame(data)
  model  year  price transmission  mileage fuelType  tax   mpg  engineSize
0    A1  2017  12500       Manual    15735   Petrol  150  55.4         1.4
1    A6  2016  16500    Automatic    36203   Diesel   20  64.2         2.0
2    A1  2016  11000       Manual    29946   Petrol   30  55.4         1.4
3    A4  2017  16800    Automatic    25952   Diesel  145  67.3         2.0
4    A3  2019  17300       Manual     1998   Petrol  145  49.6         1.0
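I want the years (1997-2021) on the x-axis and numbers from 0 to 100 on the y-axis, representing percentages. For each year, I want to show the proportions of the three fuel types: petrol, diesel, and hybrid.
I have done the following calculation to get the percentage of each fuel type per year, and now I need to put it on a chart: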
fuel_percentage = round((my_data_frame.groupby(['year'])['fuelType'].value_counts()/my_data_frame.groupby('year')['fuelType'].count())*100, 2)
print(fuel_percentage)
This gives me the following result:
year  fuelType
1997  Petrol      100.00
1998  Petrol      100.00
2002  Petrol      100.00
2003  Diesel       66.67
      Petrol       33.33
2004  Petrol       80.00
      Diesel       20.00
2005  Petrol       71.43
      Diesel       28.57
2006  Petrol       66.67
      Diesel       33.33
2007  Petrol       56.25
      Diesel       43.75
2008  Diesel       66.67
      Petrol       33.33
etc...
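My main concern is that, because the object is not a dataframe, I won't be able to use it to make a plot.
Here is an example of the type of plot I want (with fuel types in place of the players, and percentages on the y-axis):
Thanks for your help!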
Tested in python 3.8.11, pandas 1.3.3, matplotlib 3.4.3

.groupby & .unstack
pandas.DataFrame.groupby creates a long dataframe that must be unstacked to a wide form to work easily with the plotting API.
import pandas as pd

# I'm not a fan of this option because it requires doing .groupby twice
# calculate percent with groupby
dfc = (df.groupby(['year'])['fuelType'].value_counts() / df.groupby('year')['fuelType'].count()).mul(100).round(1)
# unstack the long dataframe
dfc = dfc.unstack(level=1)
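
To make the long-to-wide step concrete, here is a minimal sketch using only the year and fuelType columns from the five sample rows in the question. The names counts, totals, long_pct and wide_pct are illustrative and do not appear in the original post. It suggests that the grouped result is a Series with a two-level index, and that .unstack turns it into an ordinary dataframe with one column per fuel type, which addresses the concern about the object not being a dataframe.

```python
import pandas as pd

# Only the two columns needed for the percentage calculation
df = pd.DataFrame({
    'year': [2017, 2016, 2016, 2017, 2019],
    'fuelType': ['Petrol', 'Diesel', 'Petrol', 'Diesel', 'Petrol'],
})

# Long form: one row per (year, fuelType) pair
counts = df.groupby('year')['fuelType'].value_counts()
totals = df.groupby('year')['fuelType'].count()
long_pct = (counts / totals).mul(100).round(1)   # division aligns on the shared 'year' level

print(type(long_pct).__name__, long_pct.index.nlevels)

# Wide form: years as the index, one column per fuel type
wide_pct = long_pct.unstack(level=1)
print(type(wide_pct).__name__, list(wide_pct.columns))
print(wide_pct)
```

The wide form is the shape the plotting API handles most directly: each column becomes one colored segment of the stack, and each index value becomes one bar.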

.groupby with .value_counts and .unstack
dfc = df.groupby(['year'])['fuelType'].value_counts(normalize=True).mul(100).round(1).unstack(level=1)

.crosstab
pandas.crosstab creates a wide dataframe directly.
# get the normalized value counts by index
dfc = pd.crosstab(df.year, df.fuelType, normalize='index').mul(100).round(1)
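
One difference between the routes is worth checking on your own data. In the sketch below (the names via_groupby and via_crosstab are illustrative, and the data is the five sample rows from the question), the groupby route appears to leave NaN where a fuel type does not occur in a year, whereas crosstab fills that cell with 0. This assumes fuelType is a plain string column rather than a categorical.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2017, 2016, 2016, 2017, 2019],
    'fuelType': ['Petrol', 'Diesel', 'Petrol', 'Diesel', 'Petrol'],
})

# Route 1: groupby + value_counts(normalize=True) + unstack
via_groupby = (
    df.groupby('year')['fuelType']
      .value_counts(normalize=True)
      .mul(100).round(1)
      .unstack(level=1)
)

# Route 2: crosstab normalized across each row (each year)
via_crosstab = pd.crosstab(df['year'], df['fuelType'], normalize='index').mul(100).round(1)

# 2019 has no Diesel rows, so the unstacked table has a missing cell there
print(via_groupby)
print(via_crosstab)

# After filling the missing cells with 0, the two tables should agree
max_diff = (via_groupby.fillna(0) - via_crosstab).abs().max().max()
print(max_diff)
```

If the table is reused for anything other than plotting (for example, summing rows to confirm they total 100), calling .fillna(0) on the groupby result keeps the two routes interchangeable.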

Use pandas.DataFrame.plot with kind='bar' and stacked=True, or with kind='area'.
# display(dfc)
fuelType  Diesel  Petrol
year
2016        50.0    50.0
2017        50.0    50.0
2019         0.0   100.0
# plot bar
ax = dfc.plot(kind='bar', ylabel='Percent(%)', stacked=True, rot=0, figsize=(10, 4))
Pass xticks=dfc.index so that the plotting API places an x-axis tick at each year in the index.
# plot area
ax = dfc.plot(kind='area', ylabel='Percent(%)', rot=0, figsize=(10, 4), xticks=dfc.index)
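
For completeness, the pieces above can be combined into one self-contained script. This is a sketch under a few assumptions: it uses the five sample rows from the question rather than the full dataset, and the fixed 0 to 100 y-axis and the legend placement are illustrative choices that are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2017, 2016, 2016, 2017, 2019],
    'fuelType': ['Petrol', 'Diesel', 'Petrol', 'Diesel', 'Petrol'],
})

# Percentage of each fuel type within each year (each row sums to 100)
dfc = pd.crosstab(df['year'], df['fuelType'], normalize='index').mul(100).round(1)

# Stacked bar plot: one bar per year, one segment per fuel type
ax = dfc.plot(kind='bar', stacked=True, rot=0, figsize=(10, 4), ylabel='Percent (%)')

# Assumed styling choices: fix the y-axis to 0-100 and move the legend outside the axes
ax.set_ylim(0, 100)
ax.legend(title='fuelType', bbox_to_anchor=(1.0, 1.0), loc='upper left')
ax.figure.tight_layout()
```

Call matplotlib.pyplot.show() or ax.figure.savefig(...) afterwards to display or save the figure. With the full dataset, a third fuel type such as Hybrid would appear as an additional column of dfc and therefore as a third segment in each bar.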