繁体   English   中英

如何使用 statsmodels 的 ARMA 来预测外生变量?

[英]How to use statsmodels' ARMA to predict with exogenous variables?

我正在尝试使用 statsmodels 来拟合具有外生变量的 AR(MA) 过程。 为此,我生成了一个带有延迟外生变量的 AR(0) 过程的实现,我正在尝试恢复我对它的期望。 我能够正确地适应这个过程,但我无法使用predict方法。

以下代码是我想要实现的 MCVE。 它被大量评论,以便您可以轻松地关注它。 最后一个声明是一个失败的断言,我想让它通过。 我怀疑罪魁祸首是我如何调用函数.predict

import numpy as np
import statsmodels.tsa.api


def _transform_x(x, lag):
    """
    Converts a set of time series into a matrix of delayed signals.
    For x.shape[0] == 1, it is equivalent to call `statsmodels.tsa.api.lagmat(x_i, lag)`.

    For x.shape[0] == 1, each `row_j` is each time `t`, `column_i` is the signal at `t - i`,
    It assumes that no past signal => no signal: each row is left-padded with zeros.

    For example, for lag=3, the matrix would be:
    ```
    [0, 0   , 0   ] (-> y[0])
    [0, 0   , x[0]] (-> y[1])
    [0, x[0], x[1]] (-> y[2])
    ```

    I.e.
    The parameter fitted to column 2, a2, is the influence of `x[t - 1]` on `y[t]`.
    The parameter fitted to column 1, a1, is the influence of `x[t - 2]` on `y[t]`.
    It assumes that we only measure x[t] when we measure y[t], the reason why that column does not appear.

    For x.shape[0] > 1, it returns a concatenation of each of the matrixes for each signal.
    """
    for x_i in x:
        assert len(x_i) >= lag
        assert len(x_i.shape) == 1, 'Each of the elements must be a time-series (1D)'
    return np.concatenate(tuple(statsmodels.tsa.api.lagmat(x_i, lag) for x_i in x), axis=1)


# build the realization of the process y[t] = 1*x[t-2] + noise, where x[t] is iid from N(1,1)
t = np.arange(0, 1000, 1)
np.random.seed(1)

# the exogenous variable
x1 = 1 + np.random.normal(size=t.shape)

# this shifts x by 2 (puts the last element in the beginning, we set the beginning to 0)
y = np.roll(x1, 2) + np.random.normal(scale=0.01, size=t.shape)
y[0] = y[1] = 0

x = np.array([x1])  # x.shape[0] => each exogenous variable; x.shape[1] => each time point

# fit it with AR(2) + exogenous(2)
lag = 2

result = statsmodels.tsa.api.ARMA(y, (lag, 0), exog=_transform_x(x, lag)).fit(disp=False)

# this gives the expected. Specifically, `x2 = 0.9952` and all others are indistinguishable from 0.
# (x2 here means the highest delay, 2).
print(result.summary())

# predict 1 element out-of-sample. Because the process is y[t] = x[0, t - 2] + noise,
# the prediction should be equal to `x[0, -2]`
y_pred = result.predict(len(y), len(y), exog=_transform_x(x[:, -3:], lag))[0]

# this fails!
np.testing.assert_almost_equal(y_pred, x[0, -2], decimal=2)

有两个问题,据我所知

exog=_transform_x(x[:, -3:], lag)在 predict 中有初始值问题并且包括零而不是滞后。

索引:y[-1] 的预测应该是 x[-3],即落后两个。 如果我们想预测下一次观测,那么我们需要一个对应于预测期的扩展 exog x 数组。

如果我改变了这一点,那么 y[-1] 的断言就会传递给我:

>>> y_pred = result.predict(len(y)-1, len(y)-1, exog=_transform_x(x[:, -10:], lag)[-1])
>>> y_pred
>>> array([ 0.9308579])
>>> result.fittedvalues[-1]
>>> 
0.93085789893991366

>>> x[0, -3]
0.93037546054487086

>>> np.testing.assert_almost_equal(y_pred, x[0, -3], decimal=2)
>>> 

以上是为了预测最后一次观察。 要预测第一个样本外观察,我们需要最后一个和倒数第二个 x,这是无法通过 _transform_x 函数获得的。 例如,我只是在列表中提供它。

>>> y_pred = result.predict(len(y), len(y), exog=[[x[0, -1], x[0, -2]]])
>>> y_pred
array([ 1.35420494])
>>> x[0, -2]
1.3538704268828403
>>> np.testing.assert_almost_equal(y_pred, x[0, -2], decimal=2)
>>> 

更一般地说,为了预测更长远的范围,我们需要一系列未来的解释变量

>>> xx = np.concatenate((x, np.ones((x.shape[0], 10))), axis=1)
>>> result.predict(len(y), len(y)+9, exog=_transform_x(xx[:, -(10+lag):], lag)[-10:])
>>> 
array([ 1.35420494,  0.81332158,  1.00030139,  1.00030334,  1.000303  ,
        1.00030299,  1.00030299,  1.00030299,  1.00030299,  1.00030299])

我选择了索引,以便 predict 的 exog 包含第一行中的最后两个观察值。

>>> _transform_x(xx[:, -(10+lag):], lag)[lag:]
array([[ 0.81304498,  1.35387043],
       [ 1.        ,  0.81304498],
       [ 1.        ,  1.        ],
       ..., 
       [ 1.        ,  1.        ],
       [ 1.        ,  1.        ],
       [ 1.        ,  1.        ]])

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM