Stacked bar chart from Pandas Dataframe
I have a dataframe 'dft' with two columns: 'Month' (January through December) and 'Expenditure' for that month.
I am attempting to create a stacked bar chart for this data, with the stacks representing expenditure of 0 - 100, 100 - 500 and 500+.
To split the dataframe by these ranges I have written the following code.
small = dft[(dft['Expenditure'] < 100) & (dft['Expenditure'] > 0)]
medium = dft[(dft['Expenditure'] <= 500) & (dft['Expenditure'] >= 100)]
large = dft[(dft['Expenditure'] > 500)]
Is there a way I can then plot these dataframes in a stacked bar chart straight from Pandas? The chart would have an x axis of Month and a y axis of expenditure.
Welcome to StackOverflow!
I tried to create a simple example (using the original given data) which solves your case. You should also have a look at the stacked_bar_chart in the documentation. To convert the months and "fill up" the data you can use the following approach:
import numpy as np
import matplotlib.pyplot as plt
# given x data
x1 = ['January', 'October', 'November', 'December']
x2 = ['January', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
x3 = ['January', 'November', 'December']
# given y data
y1 = [2.0, 91.53, 16.7, 50.4]
y2 = [1240.3, 216.17, 310.77, 422.12, 513.53, 113.53, 377.249, 1179.41]
y3 = [15.6, 235.433, 574.45]
# save all months in a list
months = ['January',
          'February',
          'March',
          'April',
          'May',
          'June',
          'July',
          'August',
          'September',
          'October',
          'November',
          'December']
monthsDict = {}
# assign in a dictionary a number for each month
# 'January' : 0, 'February' : 1
for i, val in enumerate(months):
    monthsDict[val] = i
# this function converts the datasets you gave into full 12-month lists
def to_full_list(x, y):
    # initialize a list of floats with a length of 12
    result = [0.0] * 12
    # assign each month's value to the corresponding index in result
    # x[0] = January, y[0] = 2.0 would be result[0] = 2.0
    for i, val in enumerate(x):
        result[monthsDict[val]] = y[i]
    return result
# convert the given data into the right format
r1 = np.array(to_full_list(x1, y1))
r2 = np.array(to_full_list(x2, y2))
r3 = np.array(to_full_list(x3, y3))
# increase the width of the output to match the long month strings
plt.figure(figsize=(11, 6))
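# plot the cumulative sums from tallest to shortest, so the shorter bars
# are drawn on top of the taller ones and the result looks stacked
# x axis: months; y axis: values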
p3 = plt.bar(months, r3 + r2 + r1)
p2 = plt.bar(months, r2 + r1)
p1 = plt.bar(months, r1)
# display the plot
plt.show()
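As a variation on the example above, matplotlib's bar function also takes a bottom argument, which places each series on top of the running total instead of relying on overlapping bars. The following is only a sketch: the r1, r2, r3 values are the 12-month lists that to_full_list would produce for the data above, written out by hand here, and the 'series 1' to 'series 3' legend labels are placeholders I made up.

```python
import numpy as np
import matplotlib.pyplot as plt

months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']

# the three datasets from above, already expanded to 12 months (0.0 = no data)
r1 = np.array([2.0, 0, 0, 0, 0, 0, 0, 0, 0, 91.53, 16.7, 50.4])
r2 = np.array([1240.3, 0, 0, 0, 0, 216.17, 310.77, 422.12,
               513.53, 113.53, 377.249, 1179.41])
r3 = np.array([15.6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 235.433, 574.45])

fig, ax = plt.subplots(figsize=(11, 6))

# running total of everything plotted so far; each new series starts there
bottom = np.zeros(len(months))
for values, label in [(r1, 'series 1'), (r2, 'series 2'), (r3, 'series 3')]:
    ax.bar(months, values, bottom=bottom, label=label)
    bottom = bottom + values

ax.set_xlabel('Month')
ax.set_ylabel('Expenditure')
ax.legend()
# call plt.show() to display the figure
```

With bottom, the drawing order no longer matters and each bar segment has its true height, which also makes the legend line up with the segments.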
Turning my comment into an answer: Instead of splitting the dataframe, add a new column with the qualifier to stack (small, medium, large). Then pivot the frame by that new column and plot with the stacked=True option.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# some data
dft = pd.DataFrame({"month": ['January', 'October', 'November', 'December', 'January',
                              'June', 'July', 'August', 'September', 'October',
                              'November', 'December', 'January', 'November', 'December'],
                    "expenditure": [2.0, 91.53, 16.7, 50.4, 1240.3, 216.17, 310.77, 422.12,
                                    513.53, 113.53, 377.249, 1179.41, 156, 2354.33, 157.45]})
# possible labels / months
labels = ['small', 'medium', 'large']
months = pd.date_range('2014-01','2014-12', freq='MS').strftime("%B").tolist()
full = pd.DataFrame(columns=labels, index=months)
# quantize data
dft["quant"] = pd.cut(dft["expenditure"], bins = [0,100,500,np.inf], labels=labels)
# pivot data
piv = dft.pivot(values='expenditure', columns="quant", index = "month")
# update full with data to have all months/labels available, even if not
# present in original dataframe
full.update(piv)
full.plot.bar(stacked=True)
plt.show()
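One caveat with the pivot approach: DataFrame.pivot raises a ValueError if a month has more than one row in the same category. If that can happen in your data, a groupby with sum followed by unstack might be a safer route. Below is a minimal sketch of that idea, assuming the column names 'Month' and 'Expenditure' from the question; the 'Size' column name is my own choice, and the sample values are the ones used above. Note that pd.cut's default right-closed bins put a value of exactly 100 into 'small', whereas the filters in the question put it into 'medium'.

```python
import numpy as np
import pandas as pd

months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']
labels = ['small', 'medium', 'large']

# same sample values as above, with the question's column names
dft = pd.DataFrame({
    'Month': ['January', 'October', 'November', 'December', 'January',
              'June', 'July', 'August', 'September', 'October',
              'November', 'December', 'January', 'November', 'December'],
    'Expenditure': [2.0, 91.53, 16.7, 50.4, 1240.3, 216.17, 310.77, 422.12,
                    513.53, 113.53, 377.249, 1179.41, 156, 2354.33, 157.45],
})

# label each row; bins are (0, 100], (100, 500], (500, inf)
# 'Size' is a hypothetical column name; astype(str) gives plain string labels
dft['Size'] = pd.cut(dft['Expenditure'], bins=[0, 100, 500, np.inf],
                     labels=labels).astype(str)

# sum per (month, size), move the sizes into columns, then make sure all
# twelve months and all three labels are present, in calendar order
table = (dft.groupby(['Month', 'Size'])['Expenditure'].sum()
            .unstack('Size')
            .reindex(index=months, columns=labels)
            .fillna(0.0))

ax = table.plot.bar(stacked=True, figsize=(11, 6))
ax.set_ylabel('Expenditure')
# call plt.show() (after importing matplotlib.pyplot as plt) to display it
```

The reindex step does the same job as the full frame above: months without any data still get an (empty) slot on the x axis, and the months appear in calendar order rather than alphabetically.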