
Forecasting time series with multiple seasonalities using auto_arima (SARIMAX) and Fourier terms

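I am trying to forecast a time series in Python using auto_arima with Fourier terms added as exogenous features. The data come from Kaggle's Store Item Demand Forecasting Challenge. They consist of a long-format time series for 10 stores and 50 items, which gives 500 time series stacked on top of each other. The particularity of these series is that they are daily data with both weekly and annual seasonality.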

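To capture these two levels of seasonality, I first used TBATS, as recommended by Rob J Hyndman for forecasting with daily data, and it actually worked quite well.

I also followed this Medium article posted by the creator of the TBATS Python library, who compared it with SARIMAX + Fourier terms (also recommended by Hyndman).

But now, when I try the second approach with pmdarima's auto_arima and Fourier terms as exogenous features, I get unexpected results.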

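In the code below, I only use the train.csv file, which I split into training and test data (the last year is held out for forecasting), and I set the maximum order of the Fourier terms to K = 2.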
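For readers who have not met Fourier terms before, here is a minimal sketch of what K sine/cosine pairs for a given period look like when computed by hand. The helper name `fourier_terms`, its column names, and the time origin are assumptions made for illustration; they are not pmdarima's API, and `FourierFeaturizer` may label or offset its columns differently.

```python
import numpy as np
import pandas as pd


def fourier_terms(n_obs, period, k):
    """Return a DataFrame with k sine/cosine pairs for one seasonal period.

    Hypothetical helper for illustration only (not part of pmdarima).
    """
    # Time index 1..n_obs; a different origin only shifts the phase.
    t = np.arange(1, n_obs + 1)
    columns = {}
    for order in range(1, k + 1):
        angle = 2.0 * np.pi * order * t / period
        columns[f"sin_{order}"] = np.sin(angle)
        columns[f"cos_{order}"] = np.cos(angle)
    return pd.DataFrame(columns)


# Four years of daily observations, annual period, K = 2 -> 2 * K = 4 columns
annual_terms = fourier_terms(n_obs=1461, period=365.25, k=2)
print(annual_terms.shape)
```

With K = 2 the features are smooth waves with periods of about a year and half a year, so on their own they cannot describe a 7-day pattern. In this setup the weekly seasonality has to come from the seasonal part of the ARIMA model (m=7).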

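My problem is that I obtain a smooth forecast (see the image below) that does not seem to capture the weekly seasonality, unlike the result shown at the end of that article. Is there something wrong with my code?

Complete code: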

# imports
import pandas as pd
from pmdarima.preprocessing import FourierFeaturizer
from pmdarima import auto_arima
import matplotlib.pyplot as plt

# Load the data: a long-format table of multiple time series stacked on top of each other
# There are 10 (stores) * 50 (items) = 500 time series
train_data = pd.read_csv('train.csv', index_col='date', parse_dates=True)

# Select only one time series for store 1 and item 1 for the purpose of the example
train_data = train_data.query('store == 1 and item == 1').sales

# Prepare the fourier terms to add as exogenous features to auto_arima
# Annual seasonality covered by fourier terms
four_terms = FourierFeaturizer(365.25, 2)
y_prime, exog = four_terms.fit_transform(train_data)
exog['date'] = y_prime.index # reuse the dates of the original series as the index of the Fourier features
exog = exog.set_index(exog['date'])
exog.index.freq = 'D'
exog = exog.drop(columns=['date'])


# Split the time series as well as exogenous features data into train and test splits 
y_to_train = y_prime.iloc[:(len(y_prime)-365)]
y_to_test =  y_prime.iloc[(len(y_prime)-365):] # last year for testing

exog_to_train = exog.iloc[:(len(exog)-365)]
exog_to_test = exog.iloc[(len(exog)-365):]


# Fit model
# Weekly seasonality covered by SARIMAX
arima_exog_model = auto_arima(y=y_to_train, exogenous=exog_to_train, seasonal=True, m=7)

# Forecast
y_arima_exog_forecast = arima_exog_model.predict(n_periods=365, exogenous=exog_to_test)
y_arima_exog_forecast = pd.DataFrame(y_arima_exog_forecast , index = pd.date_range(start='2017-01-01', end= '2017-12-31'))


# Plots
plt.plot(y_to_test, label='Actual data')
plt.plot(y_arima_exog_forecast, label='Forecast')
plt.legend()

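Figure: actual data and forecast for the last year

Thanks in advance for your answers!

Here is the answer, in case anyone is interested. Thanks again to Flavia Giammarino.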

# imports
import pandas as pd
from pmdarima.preprocessing import FourierFeaturizer
from pmdarima import auto_arima
import matplotlib.pyplot as plt

# Load the data: a long-format table of multiple time series stacked on top of each other
# There are 10 (stores) * 50 (items) = 500 time series
train_data = pd.read_csv('train.csv', index_col='date', parse_dates=True)

# Select only one time series for store 1 and item 1 for the purpose of the example
train_data = train_data.query('store == 1 and item == 1').sales

# Prepare the fourier terms to add as exogenous features to auto_arima
# Annual seasonality covered by Fourier terms (K lowered from 2 to 1)
four_terms = FourierFeaturizer(365.25, 1)
y_prime, exog = four_terms.fit_transform(train_data)
exog['date'] = y_prime.index # reuse the dates of the original series as the index of the Fourier features
exog = exog.set_index(exog['date'])
exog.index.freq = 'D'
exog = exog.drop(columns=['date'])


# Split the time series as well as exogenous features data into train and test splits 
y_to_train = y_prime.iloc[:(len(y_prime)-365)]
y_to_test =  y_prime.iloc[(len(y_prime)-365):] # last year for testing

exog_to_train = exog.iloc[:(len(exog)-365)]
exog_to_test = exog.iloc[(len(exog)-365):]


# Fit model
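# Weekly seasonality covered by SARIMAX; D=1 forces one seasonal difference at lag m=7
# (newer pmdarima releases expect X= in place of exogenous=)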
arima_exog_model = auto_arima(y=y_to_train, D=1, exogenous=exog_to_train, seasonal=True, m=7)

# Forecast
y_arima_exog_forecast = arima_exog_model.predict(n_periods=365, exogenous=exog_to_test)
y_arima_exog_forecast = pd.DataFrame(y_arima_exog_forecast , index = pd.date_range(start='2017-01-01', end= '2017-12-31'))


# Plots
plt.plot(y_to_test, label='Actual data')
plt.plot(y_arima_exog_forecast, label='Forecast')
plt.legend()
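The code above differs from the question in two places: K goes from 2 to 1, and D=1 is passed to auto_arima. The post does not say why this helps. One plausible reading, offered here as an assumption, is that without D=1 the automatic search settled on a model with little weekly structure. The sketch below only illustrates what a single seasonal difference at lag 7 does, using synthetic data (not the Kaggle sales); the helper name `seasonal_difference` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_days = 28 * 4
t = np.arange(n_days)

# Synthetic daily series (hypothetical data, not the Kaggle sales):
# a fixed day-of-week pattern plus a slow linear trend and small noise.
weekly_pattern = np.array([10.0, 12.0, 13.0, 15.0, 18.0, 22.0, 25.0])
series = weekly_pattern[t % 7] + 0.01 * t + rng.normal(0.0, 0.1, n_days)


def seasonal_difference(y, m):
    """Return y[t] - y[t - m], i.e. one seasonal difference (D=1) at lag m."""
    y = np.asarray(y, dtype=float)
    return y[m:] - y[:-m]


diffed = seasonal_difference(series, m=7)

# The day-of-week pattern cancels out; what remains is the weekly trend
# step (7 * 0.01) plus noise, so the spread drops sharply.
print(round(series.std(), 2), round(diffed.std(), 2))
```

Because the model is fitted on these week-over-week differences, its forecasts are built on top of the last observed week, which is why the weekly shape shows up in the forecast.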


