
How to plot a polynomial regression line on a time series in R?

I have occasionally used time series in R for data analysis, but I am not familiar with plotting with functions like ARIMA.

The following question stems from a comment that the number of daily COVID cases in the US follows a cubic curve. It does look that way, and I wanted to simply run a cubic regression with the humble (and frivolous) intent of plotting a polynomial curve on the scatterplot. Since it is a time series, I don't think using the lm() function would work.

Here is the code:

options(repr.plot.width=14, repr.plot.height=10)
 
install.packages('RCurl')
require(repr) # Enables resizing of the plots.
require(RCurl)
require(foreign)
require(tidyverse) # To pivot the wide date columns into long rows (pivot_longer())

# Extracting the number of confirmed cumulative cases by country from the Johns Hopkins repository:
 
x = getURL("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
# x now holds the CSV as a single string, parsed by read_csv() below
 
corona = (read_csv(x)
          %>% pivot_longer(cols = -c(`Province/State`, `Country/Region`, Lat, Long),
                           names_to = "date",
                           values_to = "cases")
          %>% select(`Province/State`,`Country/Region`, date, cases)
          %>% mutate(date=as.Date(date,format="%m/%d/%y"))
          %>% drop_na(cases)
          %>% rename(country="Country/Region", provinces="Province/State")
)
 
cc <- (corona
       %>% filter(country %in% c("US"))
)
 
ccw <- (cc
        %>% pivot_wider(names_from="country",values_from="cases")
        %>% filter(US>5)
)

first.der<-diff(ccw$US, lag = 1, differences = 1)

# diff() returns n-1 values, so drop the first date to align each change with its day
plot(ccw$date[-1], first.der, 
     pch = 19, cex = 1.2,
     ylab='', 
     xlab='',
     main ='Daily COVID-19 cases in US',
     col="firebrick",
     axes=FALSE,
     cex.main=1.5)
abline(h=0)
abline(v=max(ccw$date), col='gray90')                         # last observed day
abline(h=tail(first.der, 1), col='firebrick', lty=2, lwd=.5)  # latest daily count

at1 <- seq(min(ccw$date), max(ccw$date), by=2);
axis.Date(1, at=at1, format="%b %d", las=2, cex.axis=0.7)
axis(side=2, seq(min(first.der),max(first.der),1000), 
     las=2, cex.axis=1)


For the intended polynomial regression we just regress on the index and its powers. For the polynomial terms we conveniently use poly() and plot the fitted values with lines(). However, the cases appear to follow a quartic curve rather than a cubic one.

ccw$first.der <- c(NA, diff(ccw$US))  ## better add an NA and integrate in data frame
ccw$index <- 1:length(ccw$US)

fit3 <- lm(first.der ~ poly(index , 3, raw=TRUE), ccw)  ## cubic
fit4 <- lm(first.der ~ poly(index , 4, raw=TRUE), ccw)  ## quartic

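## plot the column stored in ccw so points, fitted lines and date labels share one index
plot(ccw$index, ccw$first.der, main="US covid-19", xaxt="n", xlab="", ylab="daily cases")
tck <- c(1, 50, 100, 150)
axis(1, tck, labels=FALSE)
mtext(ccw$date[tck], 1, 1, at=tck)
## lm() drops the first row (NA), so the fitted values start at index 2
lines(ccw$index[-1], fit3$fitted.values, col=3, lwd=2)
lines(ccw$index[-1], fit4$fitted.values, col=2, lwd=2)
legend("topleft", c("cubic", "quartic"), lwd=2, col=3:2)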
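
The impression that a quartic fits better than a cubic can also be checked numerically rather than by eye. The following is only a minimal sketch on simulated data (the series `y` and its coefficients are invented for illustration and are not the COVID counts). It uses the default orthogonal form of poly(), which gives the same fitted values as raw=TRUE, and compares the two nested fits with anova() and AIC():

```r
# Simulated daily series with a genuine quartic trend (illustrative data only)
set.seed(42)
index <- 1:150
y <- 300 - 1e-5 * (index - 75)^4 + rnorm(150, sd = 10)
d <- data.frame(index = index, y = y)

fit3 <- lm(y ~ poly(index, 3), data = d)  # cubic
fit4 <- lm(y ~ poly(index, 4), data = d)  # quartic

# F-test for nested models: does adding the 4th-degree term
# reduce the residual sum of squares by more than chance would?
cmp <- anova(fit3, fit4)
print(cmp)

# Lower AIC indicates the better trade-off between fit and complexity
print(AIC(fit3, fit4))
```

With the real data, the same two calls can be applied to the fit3 and fit4 objects above, since both were fitted to the same rows.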


I wasn't able to download your data, so I included an example using the mtcars dataset. You can use poly() or I() to obtain a polynomial regression:

# Cubic fit; I() makes ^ act as arithmetic inside the formula
cubic_model <- lm(mpg ~ hp + I(hp^2) + I(hp^3), data = mtcars)
min_hp <- min(mtcars$hp)
max_hp <- max(mtcars$hp)
grid_hp <- seq(min_hp, max_hp, by = 0.1)
# predict() only needs hp; the I() terms are recomputed from the formula
cubic_model_line <- predict(cubic_model, data.frame(hp = grid_hp))

plot(mtcars$hp, mtcars$mpg, col='red', main='mpg vs hp', xlab='hp', ylab='mpg', pch=16)
lines(grid_hp, cubic_model_line, col='green', lwd=3)
legend(80, 15, legend=c("Data", "Cubic fit"),
       col=c("red", "green"), pch=c(16, NA), lty=c(NA, 1), lwd=3, cex=0.8)
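
Only the I() form is shown above. As a rough sketch of the poly() alternative (the object names `fit_I`, `fit_raw` and `fit_orth` are made up for this illustration), the three specifications below describe the same cubic: raw=TRUE reproduces the I() coefficients, while the default orthogonal polynomials have different coefficients but the same fitted values.

```r
fit_I    <- lm(mpg ~ hp + I(hp^2) + I(hp^3), data = mtcars)
fit_raw  <- lm(mpg ~ poly(hp, 3, raw = TRUE), data = mtcars)
fit_orth <- lm(mpg ~ poly(hp, 3), data = mtcars)

# Raw polynomials give the same coefficients as the explicit I() terms
same_coef <- all.equal(unname(coef(fit_I)), unname(coef(fit_raw)))

# Orthogonal polynomials span the same space, so the fitted curve is identical
same_fit <- all.equal(fitted(fit_I), fitted(fit_orth))

print(same_coef)
print(same_fit)

# With poly(), predict() again only needs the original variable
newdata <- data.frame(hp = c(100, 150, 200))
print(predict(fit_raw, newdata))
```

The orthogonal form is usually the numerically safer choice when the predictor takes large values or the degree is high.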

If you just want to illustrate a trend, you can use local polynomial regression, e.g., the LOESS method used by ggplot2.
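
A minimal sketch of that idea, assuming the mtcars data from the previous answer and base R's loess() rather than ggplot2 (the names `lo`, `grid_hp` and `lo_line` are illustrative; span and degree are written out at their default values):

```r
# Local polynomial (LOESS) trend with stats::loess()
lo <- loess(mpg ~ hp, data = mtcars, span = 0.75, degree = 2)

# Evaluate the smoother on a fine grid inside the observed range of hp
grid_hp <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 200)
lo_line <- predict(lo, newdata = data.frame(hp = grid_hp))

plot(mtcars$hp, mtcars$mpg, pch = 16, col = "red",
     xlab = "hp", ylab = "mpg", main = "mpg vs hp with LOESS trend")
lines(grid_hp, lo_line, col = "blue", lwd = 2)
```

In ggplot2 the comparable layer is geom_smooth(method = "loess"). A smaller span follows the data more closely, and a larger one gives a smoother curve.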

