
R: Time series analysis on sensor data

I have a data frame of sensor data as follows:

pressure    datetime
4.848374    2016-04-12 10:04:00   
4.683901    2016-04-12 10:04:32   
5.237860    2016-04-12 10:13:20 

Now I would like to apply ARIMA to it for predictive analytics.

Since the data is not sampled uniformly, I have aggregated it on an hourly basis, which looks as follows:

datetime                    pressure
"2016-04-19 00:00:00 BST"   5.581806
"2016-04-19 01:00:00 BST"   4.769832
"2016-04-19 02:00:00 BST"   4.769832  
"2016-04-19 03:00:00 BST"   4.553711  
"2016-04-19 04:00:00 BST"   6.285599  
"2016-04-19 05:00:00 BST"   5.873414

The hourly pressure looks like this:

[Image: plot of hourly pressure]

But I can't create a ts object because I am not sure what the frequency should be for hourly data.
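
The hourly aggregation described in the question can be sketched in base R roughly as follows. This is only an illustration under assumptions: the first three readings are copied from the question, the fourth is made up so that the result spans two hours, the mean is assumed as the aggregation function, and the names `readings` and `hourly` are hypothetical.

```r
# Three readings copied from the question, plus one made-up reading (the last
# one) so that the result covers two different hours.
readings <- data.frame(
  pressure = c(4.848374, 4.683901, 5.237860, 5.000000),
  datetime = as.POSIXct(
    c("2016-04-12 10:04:00", "2016-04-12 10:04:32",
      "2016-04-12 10:13:20", "2016-04-12 11:02:10"),
    tz = "UTC"  # assumption: timestamps are treated as UTC
  )
)

# Truncate each timestamp to the start of its hour.
readings$hour <- as.POSIXct(trunc(readings$datetime, units = "hours"))

# Average the pressure within each hour (the mean is an assumed choice).
hourly <- aggregate(pressure ~ hour, data = readings, FUN = mean)
hourly
```

Hours without any readings produce no row in `hourly`, so gaps should be checked for (and filled if needed) before the values are passed to ts(), which assumes evenly spaced observations.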

Your question has already been answered in the comments, but to reiterate: set the frequency to 24, because hourly data has 24 observations per day and a daily cycle is the natural seasonal period:

sensor <- ts(hourlyPressure, frequency = 24)
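
To see what the frequency setting does, here is a small sketch in base R. The values are placeholders (the integers 1 to 48), not sensor data, and `sensor_demo` is a name made up for this illustration.

```r
# Placeholder values standing in for two days of hourly readings.
sensor_demo <- ts(1:48, frequency = 24)

frequency(sensor_demo)            # 24 observations per seasonal period (one day)
cycle(sensor_demo)[c(1, 24, 25)]  # position within the day: 1, 24, then back to 1
end(sensor_demo)                  # c(2, 24): second period, 24th observation
```

With this setup the time index counts days rather than hours, which is why the real dates have to be supplied separately when plotting.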

To your next point, about fixing the dates in your plot, let's start with some example data:

# Sequence of numbers to forecast
hourlyPressure <- c(1:24, 12:36, 24:48, 36:60)
# Sequence of accompanying dates (one per value, hourly)
dates <- seq(as.POSIXct("2016-04-19 00:00:00"), as.POSIXct("2016-04-23 02:00:00"), by = "hour")

Now we can turn the hourlyPressure data into a ts() object (let's ignore the dates for a moment):

sensor <- ts(hourlyPressure, frequency = 24)

Now fit your ARIMA model. In this example I will use the auto.arima() function from the forecast package, since finding the best ARIMA model is not the focus here (although auto.arima() is a fairly robust way of choosing an ARIMA model for your data):

# Load the forecast package, which provides auto.arima() and forecast()
library(forecast)
# Fit an ARIMA model to the sensor data
sensor_arima_fit <- auto.arima(sensor)

You can then plot this data with the appropriate dates by passing them as the x value to plot():

plot(y=sensor_arima_fit$x, x=dates)

It gets a little more difficult when we forecast and want to plot the original data together with the forecasts, with the correct dates.

# Now forecast ahead (let's say 2 days) using the ARIMA model fitted above
forecast_sensor <- forecast(sensor_arima_fit, h = 48)
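
If the forecast package is not available, a comparable forecast can be sketched with base R's arima() and predict(). This is a swapped-in alternative, not what the answer uses: there is no automatic model selection, the order c(0, 1, 1) is an arbitrary choice for illustration, and `base_fit` and `base_forecast` are hypothetical names.

```r
# Same artificial series as in the answer's example.
hourlyPressure <- c(1:24, 12:36, 24:48, 36:60)
sensor <- ts(hourlyPressure, frequency = 24)

# Fit a fixed ARIMA(0,1,1); the order is an assumption, not a tuned choice.
base_fit <- arima(sensor, order = c(0, 1, 1))

# Forecast 48 hours ahead: $pred holds the point forecasts,
# $se the corresponding standard errors.
base_forecast <- predict(base_fit, n.ahead = 48)
length(base_forecast$pred)
```

Here `base_forecast$pred` plays the role of `forecast_sensor$mean` in the plotting code.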

To plot the original data and the forecasts with the correct dates, we can do the following:

# Set h to be the same as above
h <- 48
# Calculate the dates for the forecasted values (hourly, starting one hour after the last observation)
forecasted_dates <- seq(dates[length(dates)] + (60 * 60) * 1,
                        dates[length(dates)] + (60 * 60) * h, by = "hour")

# Now plot the data: observed values followed by the point forecasts
plot(y = c(forecast_sensor$x, forecast_sensor$mean),
     x = c(dates, forecasted_dates),
     xaxt = "n",
     type = "l",
     main = "Plot of Original Series and Forecasts",
     xlab = "Date",
     ylab = "Pressure")

# Correctly formatted x axis, with one date label per day
axis.POSIXct(1, at = seq(dates[1], forecasted_dates[length(forecasted_dates)], by = "day"),
             format = "%b %d",
             tick = FALSE)
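
The forecast timestamps can also be generated without computing the end point by hand, by giving seq() a start, a step and a length. A minimal sketch, assuming UTC timestamps (the time zone and the helper name `last_obs` are assumptions made here):

```r
# Timestamps of the observed series (UTC is an assumption, to avoid DST surprises).
dates <- seq(as.POSIXct("2016-04-19 00:00:00", tz = "UTC"),
             as.POSIXct("2016-04-23 02:00:00", tz = "UTC"),
             by = "hour")
h <- 48

# Start one hour after the last observation and generate exactly h timestamps.
last_obs <- dates[length(dates)]
forecasted_dates <- seq(last_obs + 3600, by = "hour", length.out = h)

range(forecasted_dates)
```

Using `length.out = h` keeps the number of timestamps tied to the forecast horizon, so the dates and the forecasts always have the same length.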

This plots the original data together with the forecasts, and the dates are correct. However, as in the plots produced by the forecast package, perhaps we want the forecasts to be drawn in blue.

# Keep the same plot as before
plot(y = c(forecast_sensor$x, forecast_sensor$mean),
     x = c(dates, forecasted_dates),
     xaxt = "n",
     type = "l",
     main = "Plot of Original Series and Forecasts",
     xlab = "Date",
     ylab = "Pressure")

# Same x axis as before, with one date label per day
axis.POSIXct(1, at = seq(dates[1], forecasted_dates[length(forecasted_dates)], by = "day"),
             format = "%b %d",
             tick = FALSE)

# This time, however, let's draw the forecasts on top in a different color
lines(y = forecast_sensor$mean, x = forecasted_dates, col = "blue")
