
Python-ggplot: Adding moving average to plot

p = ggplot(cases, aes(x="Specimen date", y="Daily lab-confirmed cases", group = 1)) + geom_point() + geom_line() + labs(title = "Daily COVID-19 Cases")
p.save(filename = date_today, height=5, width=15, units = 'in', dpi=1000)

This is my current code to plot a graph from a DataFrame containing COVID-19 cases in England; the plot is then saved. I'm trying to add a trend line similar to the ones on the Worldometer graphs (as shown below).

I cannot post images yet, so I will provide the example here.

This is what my graph currently looks like.

I am trying to achieve the '3-day moving average' and the '7-day moving average'.

See stat_smooth; you can smooth using a moving average.

For example, you may end up adding code like:

+ stat_smooth(method='mavg', method_args={'window': 3}, color='cyan')
+ stat_smooth(method='mavg', method_args={'window': 7}, color='blue')

But this will not give you a legend, because the moving average is not a variable (with corresponding values) in the dataframe; i.e. given what you want to plot, the data is not tidy. So if you want a legend, you will have to compute the moving averages yourself, create a tidy dataframe, and plot the computed averages from that tidy form.

How? Use pandas.melt, e.g.

# Compute moving averages
cases['mavg_3'] = cases['Daily lab-confirmed cases'].rolling(window=3).mean()
cases['mavg_7'] = cases['Daily lab-confirmed cases'].rolling(window=7).mean()

# Convert data to long/tidy format
cases_long = cases.melt(
    id_vars=['Specimen date', 'Daily lab-confirmed cases'],
    value_vars=['mavg_3', 'mavg_7'],
    var_name='mavg_type',
    value_name='mavg_value'
)

# Plot tidy data
(ggplot(cases_long, aes(x="Specimen date", y="Daily lab-confirmed cases"))
 + geom_point()
 + geom_line(aes(y='mavg_value', color='mavg_type'), na_rm=True)
)
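
To see what the tidy frame looks like before plotting, here is a minimal sketch on made-up numbers. The toy `cases` frame and its values are assumptions for illustration only, not the real England data; the column names are the ones used above.

```python
import pandas as pd

# Hypothetical toy data standing in for the real `cases` DataFrame.
cases = pd.DataFrame({
    "Specimen date": pd.date_range("2020-03-01", periods=8, freq="D"),
    "Daily lab-confirmed cases": [10, 20, 30, 40, 50, 60, 70, 80],
})

# Trailing moving averages. The first (window - 1) rows have no full
# window behind them, so pandas fills them with NaN.
cases["mavg_3"] = cases["Daily lab-confirmed cases"].rolling(window=3).mean()
cases["mavg_7"] = cases["Daily lab-confirmed cases"].rolling(window=7).mean()

# Long/tidy form: one row per (date, mavg_type) pair.
cases_long = cases.melt(
    id_vars=["Specimen date", "Daily lab-confirmed cases"],
    value_vars=["mavg_3", "mavg_7"],
    var_name="mavg_type",
    value_name="mavg_value",
)

print(cases_long.shape)                         # rows double: 8 dates x 2 averages
print(cases_long["mavg_value"].isna().sum())    # NaNs from the incomplete windows
```

The NaN rows at the start of each average are why na_rm=True is passed to geom_line above. Note also that every date appears once per mavg_type in the long frame, so geom_point draws each raw point twice, one on top of the other; this is harmless for the picture.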
