How to make an annotated grouped stacked bar chart in matplotlib?
I have COVID-19 tracking time series data that I scraped from a COVID-19 tracking site, and I want to make an annotated grouped stacked bar chart from it. I used matplotlib and seaborn for plotting and worked out how to render a corresponding bar chart. I tried the plot annotation approaches suggested on SO but did not get a correctly annotated plot. I also have trouble getting a grouped stacked bar chart for the time series data. Can anyone suggest a possible way of doing this?
My attempt
Here is the reproducible time series data that I scraped from the COVID-19 tracking site, together with my plotting code:
import pandas as pd
from datetime import date
import matplotlib.pyplot as plt
import seaborn as sns
bigdf = pd.read_csv("coviddf.csv")
bigdf['run_date'] = pd.to_datetime(bigdf['run_date'])
for g, d in bigdf.groupby(['company']):
    data = d.groupby(['run_date','county-state', 'company', 'est'], as_index=True).agg({'new': sum, 'confirmed': sum, 'death': sum}).stack().reset_index().rename(columns={'level_4': 'type', 0: 'val'})
    print(f'{g}')
    g = sns.FacetGrid(data, col='est', sharex=False, sharey=False, height=5, col_wrap=4)
    g.map(sns.barplot, 'run_date', 'val', 'type', order=data.run_date.dt.date.unique(), hue_order=data['type'].unique())
    g.add_legend()
    g.set_xticklabels(rotation=90)
    g.set(yscale='log')
    plt.tight_layout()
    plt.show()
I have a couple of issues with the above attempt. I need a grouped stacked bar chart where each group is a different company and each stacked bar is an individual establishment (the est column in coviddf.csv). Each company might have multiple establishments, so I want to see the number of new, confirmed and death COVID-19 cases per establishment in a grouped stacked bar chart. Is there any way to make an annotated grouped stacked bar chart for this time series? How can I put these plots on one page?
Desired output
I tried to make a grouped stacked bar chart the way this post and a second related post did. Here is the annotated grouped stacked bar chart that I want to make:
Can anyone point out how to get there from my current attempt?
confirmed is so large compared to the other values that you will not be able to see new and death.
There is a group for both company and est.
import pandas as pd
import matplotlib.pyplot as plt  # needed for the plt.* calls below
# load the data
df = pd.read_csv("https://gist.githubusercontent.com/jerry-shad/318595505684ea4248a6cc0949788d33/raw/31bbeb08f329b4b96605b8f2a48f6c74c3e0b594/coviddf.csv")
df.drop(columns=['Unnamed: 0'], inplace=True) # drop this extra column
# select columns and shape the dataframe
dfs = df.iloc[:, [2, 3, 4, 12, 13]].set_index(['company', 'est']).sort_index(level=0)
# display(dfs)
                       confirmed  new  death
company         est
Agri Co.        235        10853    0    237
CS Packers      630        10930   77    118
Caviness        675          790    5     19
Central Valley  6063A       6021   44     72
FPL             332         5853   80    117
# plot
ax = dfs.plot.barh(figsize=(8, 25), width=0.8)
plt.xscale('log')
plt.grid(True)
plt.tick_params(labelbottom=True, labeltop=True)
plt.xlim(10**0, 1000000)
# annotate the bars
for rect in ax.patches:
    # Find where everything is located
    height = rect.get_height()
    width = rect.get_width()
    x = rect.get_x()
    y = rect.get_y()
    # The width of the bar is the count value and can be used as the label
    label_text = f'{width:.0f}'
    label_x = x + width
    label_y = y + height / 2
    # don't include label if it's equivalently 0
    if width > 0.001:
        ax.annotate(label_text, xy=(label_x, label_y), va='center', xytext=(2, -1), textcoords='offset points')
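As a possible alternative to looping over ax.patches, matplotlib 3.4 and later provide Axes.bar_label, which labels one bar container at a time. The sketch below is a minimal illustration, not the answer's own method: it rebuilds dfs from the five rows printed in the table above (so it runs without downloading the CSV), uses the headless Agg backend, and counts the labels in a helper variable labels_drawn that exists only for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the table printed above (not the full dataset)
dfs = pd.DataFrame(
    {
        "company": ["Agri Co.", "CS Packers", "Caviness", "Central Valley", "FPL"],
        "est": ["235", "630", "675", "6063A", "332"],
        "confirmed": [10853, 10930, 790, 6021, 5853],
        "new": [0, 77, 5, 44, 80],
        "death": [237, 118, 19, 72, 117],
    }
).set_index(["company", "est"])

ax = dfs.plot.barh(figsize=(8, 6), width=0.8)
ax.set_xscale("log")
ax.set_xlim(1, 1_000_000)

labels_drawn = 0  # illustration-only counter of non-empty labels
for container in ax.containers:  # one container per column (confirmed, new, death)
    # Use an empty string for zero-width bars so they get no visible label
    labels = [f"{v:.0f}" if v > 0 else "" for v in container.datavalues]
    texts = ax.bar_label(container, labels=labels, padding=2)
    labels_drawn += sum(1 for t in texts if t.get_text())
```

Passing labels explicitly keeps the "skip zero values" behaviour of the loop above; without it, bar_label would also print a 0 next to the empty bar.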
new and death are barely visible compared to confirmed.
dfs.plot.barh(stacked=True, figsize=(8, 15))
plt.xscale('log')
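For the layout the question asks about (groups are companies, each bar is one establishment, each bar is stacked by new, confirmed and death), a rough matplotlib sketch could look like the following. It is only an illustration under stated assumptions: the data frame is hypothetical sample data rather than coviddf.csv, rows are assumed to be sorted by company, a linear axis is used because stacked segments are hard to read on a log axis, and Axes.bar_label requires matplotlib 3.4 or later.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: two companies with several establishments each
df = pd.DataFrame(
    {
        "company": ["A", "A", "B", "B", "B"],
        "est": ["e1", "e2", "e3", "e4", "e5"],
        "new": [5, 8, 3, 6, 2],
        "confirmed": [40, 55, 30, 25, 60],
        "death": [4, 6, 2, 3, 5],
    }
)
metrics = ["new", "confirmed", "death"]

# x positions: consecutive within a company, one empty slot between companies
group_id = pd.factorize(df["company"])[0]
x = np.arange(len(df)) + group_id

fig, ax = plt.subplots(figsize=(8, 4))
bottom = np.zeros(len(df))  # running top of each stack
for metric in metrics:
    bars = ax.bar(x, df[metric], bottom=bottom, label=metric)
    ax.bar_label(bars, label_type="center")  # value in the middle of each segment
    bottom += df[metric].to_numpy()

ax.set_xticks(x)
ax.set_xticklabels((df["company"] + "\n" + df["est"]).tolist())
ax.legend()
```

The gap between groups comes purely from the x offsets, so the same idea carries over to barh by swapping bottom for left.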
I had trouble finding information on how to create a grouped and stacked bar chart in matplotlib, and later in Plotly.
Here is my attempt at solving your problem (using Plotly):
# Import packages (plotly.graph_objects provides the `go` namespace used below)
import pandas as pd
import numpy as np
import plotly.graph_objects as go
# Load data (I used the raw GitHub link so that no local file download was required)
bigdf = pd.read_csv("https://gist.githubusercontent.com/jerry-shad/318595505684ea4248a6cc0949788d33/raw/31bbeb08f329b4b96605b8f2a48f6c74c3e0b594/coviddf.csv")
# Get all companies names and number of companies
allComp = np.unique(bigdf.company)
numComp = allComp.shape[0]
# For all the companies
for i in range(numComp):
    # Grab company data and the names of the establishments for that company
    comp = allComp[i]
    compData = bigdf.loc[bigdf.company == comp]
    estabs = compData.est.to_numpy().astype(str)
    numEst = compData.shape[0]
    # Grab the new, confirmed, and death values for each of the establishments in that company
    newVals = []
    confirmedVals = []
    deathVals = []
    for j in range(numEst):  # separate index so the outer loop variable is not shadowed
        estabData = compData.loc[compData.est == estabs[j]]
        newVals.append(estabData.new.to_numpy()[0])
        confirmedVals.append(estabData.confirmed.to_numpy()[0])
        deathVals.append(estabData.death.to_numpy()[0])
    # Load that data into a Plotly graph object
    fig = go.Figure(
        data=[
            go.Bar(name='New', x=estabs, y=newVals, yaxis='y', offsetgroup=1),
            go.Bar(name='Confirmed', x=estabs, y=confirmedVals, yaxis='y', offsetgroup=2),
            go.Bar(name='Death', x=estabs, y=deathVals, yaxis='y', offsetgroup=3)
        ]
    )
    # Update the layout (add title, set x/y axis titles, and bar graph mode)
    fig.update_layout(title='COVID Data for ' + comp, xaxis=dict(type='category'), xaxis_title='Establishment',
                      yaxis_title='Value', barmode='stack')
    fig.show()
The output is 16 separate Plotly graphs, one per company. They are interactive, so you can toggle individual traces, which helps because scaling the new/confirmed/death values against each other wasn't so easy. Each plot has all the establishments for that company on the x-axis, and the new/confirmed/death values for each establishment as a stacked bar chart.
Here is an example plot:
I know this doesn't completely answer your question, but I hope you appreciate my effort :)