I'm a beginner in matplotlib. I've already created some graphs from a dataframe. Now I want to create the average graph from multiple graphs.
Can anyone help me with this?
The code for my graphs looks like this:
alb = df[df.location == 'Albania']
bel = df[df.location == 'Belarus']
bih = df[df.location == 'Bosnia and Herzegovina']
blg = df[df.location == 'Bulgaria']
cro = df[df.location == 'Croatia']
hun = df[df.location == 'Hungary']
kos = df[df.location == 'Kosovo']
mld = df[df.location == 'Moldova']
mng = df[df.location == 'Montenegro']
nmc = df[df.location == 'North Macedonia']
pol = df[df.location == 'Poland']
rom = df[df.location == 'Romania']
rus = df[df.location == 'Russia']
srb = df[df.location == 'Serbia']
tur = df[df.location == 'Turkey']
ukr = df[df.location == 'Ukraine']
plt.plot(alb.date, alb.new_deaths_per_million)
plt.plot(bel.date, bel.new_deaths_per_million)
plt.plot(bih.date, bih.new_deaths_per_million)
plt.plot(blg.date, blg.new_deaths_per_million)
plt.plot(cro.date, cro.new_deaths_per_million)
plt.plot(hun.date, hun.new_deaths_per_million)
plt.plot(kos.date, kos.new_deaths_per_million)
plt.plot(mld.date, mld.new_deaths_per_million)
plt.plot(mng.date, mng.new_deaths_per_million)
plt.plot(nmc.date, nmc.new_deaths_per_million)
plt.plot(pol.date, pol.new_deaths_per_million)
plt.plot(rom.date, rom.new_deaths_per_million)
plt.plot(rus.date, rus.new_deaths_per_million)
plt.plot(srb.date, srb.new_deaths_per_million)
plt.plot(tur.date, tur.new_deaths_per_million)
plt.plot(ukr.date, ukr.new_deaths_per_million)
plt.legend(['Albania', 'Belarus', 'Bosnia and Herzegovina', 'Bulgaria', 'Croatia', 'Hungary', 'Kosovo', 'Moldova', 'Montenegro', 'North Macedonia', 'Poland', 'Romania', 'Russia', 'Serbia', 'Turkey', 'Ukraine'])
# rcParams only affect figures created afterwards, so resize the current figure directly
fig = plt.gcf()
fig.set_size_inches(12, 8)
fig.set_dpi(200)
plt.show()
The graphs look like this:
This is the dataset I use:
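import matplotlib.pyplot as plt
import pandas as pd
url = 'https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv'
# error_bad_lines was removed in pandas 2.0; on_bad_lines='skip' (pandas >= 1.3) replaces it
df = pd.read_csv(url, on_bad_lines='skip')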
pd.options.display.max_columns = None
display(df)
Now I would like to create the average curve from all these graphs. Can someone help me with the code; how to do this in the easiest way?
Thanks in advance for the help!
Easiest would be to take the mean of new_deaths_per_million across all countries for each date and plot that. You can do this by creating a new DataFrame:
frames = [alb, bel, bih, blg, cro, hun, kos, mld, mng, nmc, pol, rom, rus, srb, tur, ukr]
result = pd.concat(frames)[["date", "new_deaths_per_million"]] # Only select the relevant columns for now
result = result.groupby(["date"], as_index=False).mean() # Take the mean of all the values on the same date
print(result)
Output:
date new_deaths_per_million
2020-01-31 NaN
2020-02-01 NaN
2020-02-02 NaN
2020-02-03 NaN
2020-02-04 NaN
... ...
2021-12-15 7.698063
2021-12-16 6.557625
2021-12-17 7.023750
2021-12-18 4.249562
2021-12-19 3.559250
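You can then plot this as well, just as you have done with the individual countries. Please note that you seem to have some missing data, resulting in NaN (not a number) values. Since mean() skips NaN values by default, a NaN in the result means that none of the countries had a value for that date. You might want to clean the data a bit first.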
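As a rough sketch of those last two steps (dropping the empty dates and drawing the mean on top of the country curves), something like the following might work. The helper names mean_curve and plot_with_mean are made up for this example and are not part of pandas or matplotlib. The code assumes the location, date and new_deaths_per_million columns shown in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd


def mean_curve(df, countries, column="new_deaths_per_million"):
    """Return the per-date mean of `column` across the given countries.

    Hypothetical helper: assumes `df` has 'location' and 'date' columns.
    """
    subset = df[df["location"].isin(countries)]
    # mean() skips NaN by default, so a date is NaN only if every country is missing.
    mean = subset.groupby("date")[column].mean()
    # Drop the dates on which no country reported a value.
    return mean.dropna()


def plot_with_mean(df, countries, column="new_deaths_per_million"):
    """Plot one thin line per country and the mean as a thick black line."""
    # Parse the dates so the x axis is a real time axis rather than strings.
    data = df[df["location"].isin(countries)].assign(
        date=lambda d: pd.to_datetime(d["date"])
    )
    fig, ax = plt.subplots(figsize=(12, 8))
    for name, group in data.groupby("location"):
        ax.plot(group["date"], group[column], linewidth=0.8, alpha=0.5, label=name)
    mean = mean_curve(data, countries, column)
    ax.plot(mean.index, mean.values, color="black", linewidth=2.5, label="Mean")
    ax.set_ylabel(column)
    ax.legend()
    return ax
```

Calling plot_with_mean(df, ['Albania', 'Belarus']) followed by plt.show() should draw both countries plus their mean. Filtering with isin also avoids creating one variable per country.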