How to make a stacked plot from a DataFrame with categorical columns
I have a DataFrame:
loan_status Principal
244 PAIDOFF 1000
245 PAIDOFF 1000
246 PAIDOFF 1000
247 PAIDOFF 1000
248 PAIDOFF 1000
249 PAIDOFF 1000
250 PAIDOFF 800
252 PAIDOFF 1000
253 PAIDOFF 1000
254 PAIDOFF 1000
255 PAIDOFF 1000
256 PAIDOFF 800
257 PAIDOFF 1000
258 PAIDOFF 1000
259 PAIDOFF 1000
260 COLLECTION 1000
261 COLLECTION 1000
262 COLLECTION 800
263 COLLECTION 800
264 COLLECTION 800
265 COLLECTION 1000
266 COLLECTION 1000
I would like the result to be:
Hoping for your help, thanks.
Use pandas.DataFrame.groupby.
With .count aggregation:
import pandas as pd
import matplotlib.pyplot as plt
df.groupby(['Principal', 'loan_status'])['loan_status'].count().unstack().plot.bar(stacked=True)
plt.show()
With .sum aggregation:
df.groupby(['Principal', 'loan_status'])['Principal'].sum().unstack().plot.bar(stacked=True)
plt.show()
With .mean aggregation:
df.groupby(['Principal', 'loan_status'])['Principal'].mean().unstack().plot.bar(stacked=True)
plt.show()
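To see what the groupby/unstack step produces before it is plotted, here is a minimal sketch that rebuilds the sample rows from the question and prints the intermediate table. The names `counts` and `plot_stacked` are illustrative, and `.size()` is used here as an equivalent way to count rows per group; it is not the only option.

```python
import pandas as pd

# Rebuild the sample rows from the question (index 251 is absent there too).
index = list(range(244, 251)) + list(range(252, 267))
df = pd.DataFrame(
    {
        "loan_status": ["PAIDOFF"] * 15 + ["COLLECTION"] * 7,
        "Principal": (
            [1000] * 6 + [800] + [1000] * 4 + [800] + [1000] * 3  # PAIDOFF rows
            + [1000, 1000, 800, 800, 800, 1000, 1000]             # COLLECTION rows
        ),
    },
    index=index,
)

# Count rows per (Principal, loan_status) pair, then pivot loan_status
# into columns so that each column becomes one layer of the stacked bars.
counts = df.groupby(["Principal", "loan_status"]).size().unstack(fill_value=0)
print(counts)


def plot_stacked(table):
    """Draw one stacked bar per index value of `table` (hypothetical helper)."""
    import matplotlib.pyplot as plt

    ax = table.plot.bar(stacked=True)
    ax.set_ylabel("count")
    plt.show()
    return ax
```

Calling `plot_stacked(counts)` should draw one bar per Principal value, split by loan_status.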
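With pandas, you can create a cross-tabulation of the two variables, which gives you the counts by default. If one of the variables is numeric, you can apply an aggregation function to it. A stacked bar chart can be plotted directly from the table, as in the following example, where the 'Principal' values are summed: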
import pandas as pd # v 1.1.3
# Note that if the 'values' and 'aggfunc' arguments are omitted, the
# table will contain the counts
ctab = pd.crosstab(index=df['Principal'], columns=df['loan_status'],
                   values=df['Principal'], aggfunc='sum')
ctab.plot.bar(stacked=True)
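For comparison, a small sketch of the two crosstab variants side by side, using the sample rows from the question. The variable names `ctab_counts` and `ctab_sums` are illustrative only.

```python
import pandas as pd

# Sample rows from the question (index 251 is missing in the original data).
index = list(range(244, 251)) + list(range(252, 267))
df = pd.DataFrame(
    {
        "loan_status": ["PAIDOFF"] * 15 + ["COLLECTION"] * 7,
        "Principal": (
            [1000] * 6 + [800] + [1000] * 4 + [800] + [1000] * 3
            + [1000, 1000, 800, 800, 800, 1000, 1000]
        ),
    },
    index=index,
)

# Without 'values' and 'aggfunc', crosstab returns the number of rows
# in each (Principal, loan_status) cell.
ctab_counts = pd.crosstab(index=df["Principal"], columns=df["loan_status"])

# With 'values' and 'aggfunc', each cell aggregates the given values instead.
ctab_sums = pd.crosstab(index=df["Principal"], columns=df["loan_status"],
                        values=df["Principal"], aggfunc="sum")

print(ctab_counts)
print(ctab_sums)
```

Either table can be passed to `.plot.bar(stacked=True)`; which one fits depends on whether the bar heights should show row counts or summed amounts.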