matplotlib / pyplot: plot average curve for multiple graphs

I'm a beginner in matplotlib. I've already created some graphs from a DataFrame. Now I want to create the average graph from multiple graphs.

Can anyone help me with this?

My code looks like this:

alb = df[df.location == 'Albania']
bel = df[df.location == 'Belarus']
bih = df[df.location == 'Bosnia and Herzegovina']
blg = df[df.location == 'Bulgaria']
cro = df[df.location == 'Croatia']
hun = df[df.location == 'Hungary']
kos = df[df.location == 'Kosovo']
mld = df[df.location == 'Moldova']
mng = df[df.location == 'Montenegro']
nmc = df[df.location == 'North Macedonia']
pol = df[df.location == 'Poland']
rom = df[df.location == 'Romania']
rus = df[df.location == 'Russia']
srb = df[df.location == 'Serbia']
tur = df[df.location == 'Turkey']
ukr = df[df.location == 'Ukraine']

import matplotlib.pyplot as plt

# Set figure size and resolution before plotting so they apply to this figure
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['figure.dpi'] = 200

plt.plot(alb.date, alb.new_deaths_per_million)
plt.plot(bel.date, bel.new_deaths_per_million)
plt.plot(bih.date, bih.new_deaths_per_million)
plt.plot(blg.date, blg.new_deaths_per_million)
plt.plot(cro.date, cro.new_deaths_per_million)
plt.plot(hun.date, hun.new_deaths_per_million)
plt.plot(kos.date, kos.new_deaths_per_million)
plt.plot(mld.date, mld.new_deaths_per_million)
plt.plot(mng.date, mng.new_deaths_per_million)
plt.plot(nmc.date, nmc.new_deaths_per_million)
plt.plot(pol.date, pol.new_deaths_per_million)
plt.plot(rom.date, rom.new_deaths_per_million)
plt.plot(rus.date, rus.new_deaths_per_million)
plt.plot(srb.date, srb.new_deaths_per_million)
plt.plot(tur.date, tur.new_deaths_per_million)
plt.plot(ukr.date, ukr.new_deaths_per_million)
plt.legend(['Albania', 'Belarus', 'Bosnia and Herzegovina', 'Bulgaria', 'Croatia', 'Hungary', 'Kosovo', 'Moldova', 'Montenegro', 'North Macedonia', 'Poland', 'Romania', 'Russia', 'Serbia', 'Turkey', 'Ukraine'])
plt.show()

The graphs look like this:

[image]

This is the dataset I use:

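import pandas as pd

url = 'https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv'
# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
df = pd.read_csv(url, on_bad_lines='skip')
pd.options.display.max_columns = None
display(df)  # available in Jupyter/IPython only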

Now I would like to create the average curve from all these graphs. Can someone help me with the code? What is the easiest way to do this?

Thanks in advance for the help!

The easiest approach is to take the mean of new_deaths_per_million across all countries for each date and plot that. You can do this by creating a new DataFrame:

frames = [alb, bel, bih, blg, cro, hun, kos, mld, mng, nmc, pol, rom, rus, srb, tur, ukr]
result = pd.concat(frames)[["date", "new_deaths_per_million"]] # Only select the relevant columns for now

result = result.groupby(["date"], as_index=False).mean() # Take the mean of all the values on the same date
print(result)

Output:

date        new_deaths_per_million                  
2020-01-31                     NaN
2020-02-01                     NaN
2020-02-02                     NaN
2020-02-03                     NaN
2020-02-04                     NaN
...                            ...
2021-12-15                7.698063
2021-12-16                6.557625
2021-12-17                7.023750
2021-12-18                4.249562
2021-12-19                3.559250

You can then plot this the same way you plotted the individual countries. Note that some data appears to be missing, which results in NaN (not a number) values, so you might want to clean the data a bit first.
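As a rough sketch of that last step, the snippet below computes the per-date mean, drops the dates where every country is missing, and draws the average on top of the individual curves. The `sample` DataFrame and the `average_curve` helper are hypothetical stand-ins invented for illustration. The sample only mimics the column names of the real dataset, with made-up values, so substitute your own `df` and country list.

```python
import matplotlib.pyplot as plt
import pandas as pd

nan = float("nan")

# Hypothetical stand-in for the real `df`: same column names, made-up values.
sample = pd.DataFrame({
    "location": ["Albania"] * 3 + ["Poland"] * 3,
    "date": pd.to_datetime(["2021-12-17", "2021-12-18", "2021-12-19"] * 2),
    "new_deaths_per_million": [nan, 2.0, 4.0, nan, 4.0, 8.0],
})


def average_curve(data, countries):
    """Return the per-date mean of new_deaths_per_million over `countries`."""
    subset = data[data["location"].isin(countries)]
    # mean() skips NaN by default; a date stays NaN only if all countries lack data.
    mean = subset.groupby("date")["new_deaths_per_million"].mean()
    return mean.dropna()  # drop dates with no data at all


countries = ["Albania", "Poland"]
avg = average_curve(sample, countries)

fig, ax = plt.subplots(figsize=(12, 8))
# Individual countries as faint lines, the average as a bold black line.
for name, group in sample[sample["location"].isin(countries)].groupby("location"):
    ax.plot(group["date"], group["new_deaths_per_million"], alpha=0.4, label=name)
ax.plot(avg.index, avg.values, color="black", linewidth=2, label="Average")
ax.legend()
plt.show()
```

Dropping all-NaN dates is only one possible cleaning choice. Depending on what you want to show, filling gaps or smoothing with a rolling mean may suit the data better.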
