简体   繁体   English

R预测和可视化

[英]R prediction and visualization

I have fitted a polynomial to my data and visualized the results. 我已经将多项式拟合到我的数据中并可视化结果。 I'm trying to extend my plot to future and predict the x value (date) when y is lower than 70. My data is HERE to replicate. 我正在尝试将我的情节扩展到未来,并在y低于70时预测x值(日期)。我的数据在这里可以复制。 My current code is below. 我目前的代码如下。

data <- read.table("data.txt", sep="\t", header=T)

data$date<- as.Date(data$date)
data$y <- as.numeric(data$y)

attach(data)

x <- 1:88 # vector for formula coordinates. I haven't found a way to plot polynomial formula with dates..

p <- qplot(date, y, data=data , geom="line", xlab="Time", ylab="y")
p+ geom_smooth(method = "lm", formula = y ~ poly(x, 3))


fit <- lm(y~poly(x,3)) 
summary(fit) #Fit is adequate

Which results to this plot: 这个结果的结果如下:

在此输入图像描述

The third order polynomial was made with numeric x vector because I didn't know how to use dates as "coordinates" for the formula. 三阶多项式是用数字x向量制作的,因为我不知道如何使用日期作为公式的“坐标”。 What I would like is to forecast ie extend this plot to the future and find out at what date is y lower than 70 using this formula. 我想要的是预测即将该图扩展到未来,并使用此公式找出y低于70的日期。

A bit hacky, but gets the job done: 有点hacky,但完成工作:

在此输入图像描述

Code

# Define timeframe to predict, convert dates to numeric
days <- as.numeric(seq.Date(max(df$date) + 1, max(df$date) + 120, by = "days"))

# Build model
model <- loess(y ~ as.numeric(date), df, control = loess.control(surface = "direct"))

# Apply model to timeframe
p <- predict(model, days)

# Convert date back to Date format, build result dataframe
result <- data.frame(date = as.Date(days, origin = "1970-01-01"),
                     y = p)

# Plot three elements: original data, model, prediction
ggplot() +  
    geom_line(data = df, aes(date, y)) +
    geom_smooth(data = df, aes(date, y), method = "loess", se = FALSE) + 
    geom_line(data = result, aes(date, y), linetype = "dashed", color = "red", size = 1)

Data 数据

df <- structure(list(date = structure(c(16166, 16167, 16168, 16169, 16170, 16171, 16172, 16173, 16174, 16175, 16176, 16177, 16178, 16179, 16180, 16186, 16187, 16188, 16189, 16190, 16191, 16205, 16206, 16207, 16208, 16209, 16210, 16211, 16212, 16216, 16217, 16218, 16219, 16261, 16262, 16263, 16264, 16265, 16266, 16267, 16268, 16269, 16270, 16271, 16272, 16273, 16274, 16275, 16282, 16283, 16284, 16285, 16286, 16287, 16288, 16289, 16290, 16291, 16292, 16293, 16294, 16295, 16296, 16297, 16298, 16299, 16300, 16301, 16302, 16303, 16304, 16305, 16306, 16307, 16308, 16309, 16310, 16311, 16312, 16313, 16315, 16316, 16317, 16318, 16319, 16320, 16321, 16322), class = "Date"), y = c(95.543962, 95.573412, 95.589183, 95.500536, 95.563371, 95.579541, 94.979131, 95.56979, 95.545374, 95.912162, 95.687874, 95.564335, 95.538733, 95.579036, 95.539545, 94.068515, 94.584192, 95.479851, 95.554502, 95.517236, 95.514891, 95.541116, 95.52134, 95.545067, 95.551372, 95.520105, 95.535395, 95.494109, 95.501609, 95.544039, 95.545912, 95.560667, 95.435162, 94.934045, 95.072639, 95.050748, 94.676876, 94.68793, 95.068279, 95.038642, 94.408982, 94.429949, 94.990296, 94.75853, 95.1649, 95.095966, 93.945934, 93.934546, 92.71179, 92.757176, 93.429478, 93.730306, 93.840446, 93.769516, 93.958374, 93.94293, 93.940904, 93.776711, 93.474757, 92.255233, 92.779808, 92.508432, 92.869858, 92.846158, 93.533357, 93.233847, 93.392017, 93.613915, 93.520494, 93.761786, 93.562945, 93.584771, 93.650417, 93.091347, 92.813293, 92.650896, 92.577961, 92.468491, 93.269589, 93.242729, 91.626408, 91.157243, 90.486782, 90.989062, 91.766393, 91.477911, 90.463049, 91.182974)), row.names = c(NA, -88L), class = "data.frame")

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM