
statsmodels SARIMAX with exogenous variables: "endog and exog matrices are different sizes"

I'm running a SARIMAX model but having trouble specifying the exogenous variables. With a single exogenous variable, lesdata['LESpost'], the model runs without a problem. However, when I add a second exogenous variable, as in the code below, I get an error (see the stack trace).

ar = (1,0,1)      #  AR terms at lags 1 and 3 (lag 2 excluded)
ma = 0            #  No MA terms
mod1 = sm.tsa.statespace.SARIMAX(lesdata['emadm'], exog= (lesdata['LESpost'],lesdata['QOF']), trend='c', order=(ar,0,ma), mle_regression=True)

Traceback (most recent call last):

  File "<ipython-input-129-d1300aeaeffc>", line 4, in <module>
    mle_regression=True)

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\tsa\statespace\sarimax.py", line 510, in __init__
    endog, exog=exog, k_states=k_states, k_posdef=k_posdef, **kwargs

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 84, in __init__
    missing='none')

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 43, in __init__
    super(TimeSeriesModel, self).__init__(endog, exog, missing=missing)

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\model.py", line 212, in __init__
    super(LikelihoodModel, self).__init__(endog, exog, **kwargs)

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\model.py", line 63, in __init__
    **kwargs)

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\model.py", line 88, in _handle_data
    data = handle_data(endog, exog, missing, hasconst, **kwargs)

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 630, in handle_data
    **kwargs)

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 80, in __init__
    self._check_integrity()

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 496, in _check_integrity
    super(PandasData, self)._check_integrity()

  File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 403, in _check_integrity
    raise ValueError("endog and exog matrices are different sizes")

ValueError: endog and exog matrices are different sizes

Is there something obvious I am missing here? The variables are all of the same length and there are no missing data.

Thanks for reading, and I hope you can help!

Two-dimensional data needs to have observations in rows and variables in columns after numpy.asarray is applied to it.

exog = (lesdata['LESpost'],lesdata['QOF'])

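Applying asarray to this tuple puts each variable in its own row, so the result has shape (number of variables, number of observations). That is the numpy default, inherited from its C origins, but it is not the layout statsmodels wants, so the row count of exog no longer matches the length of endog.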
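The shape difference can be checked directly. The following is a minimal sketch, assuming a small made-up DataFrame as a stand-in for lesdata (the real data is not shown in the question, so the values here are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for lesdata: 6 observations, two exogenous columns.
lesdata = pd.DataFrame({
    "LESpost": [0, 0, 0, 1, 1, 1],
    "QOF": [0, 0, 1, 1, 1, 1],
})

# A tuple of Series: each Series becomes one row of the resulting array.
as_tuple = np.asarray((lesdata["LESpost"], lesdata["QOF"]))

# A two-column DataFrame: each Series stays a column.
as_frame = np.asarray(lesdata[["LESpost", "QOF"]])

print(as_tuple.shape)  # (2, 6): variables in rows
print(as_frame.shape)  # (6, 2): observations in rows
```

With 6 observations in endog, only the second layout has a matching number of rows.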

DataFrames are already shaped appropriately, so one option is to pass a DataFrame with the desired columns:

exog = lesdata[['LESpost', 'QOF']]

Another option for lists or tuples of array_likes is to use numpy.column_stack, e.g.

exog = np.column_stack((lesdata['LESpost'].values,lesdata['QOF'].values))
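As a quick sanity check, the two options should yield the same values in the same (observations, variables) layout. This is a sketch under the assumption of a made-up lesdata with the three columns named in the question; the numbers are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for lesdata; the real data is not shown in the question.
lesdata = pd.DataFrame({
    "emadm": [10.0, 12.0, 11.0, 9.0, 8.0, 7.5],
    "LESpost": [0, 0, 0, 1, 1, 1],
    "QOF": [0, 0, 1, 1, 1, 1],
})

# Option 1: select the columns as a DataFrame.
exog_frame = lesdata[["LESpost", "QOF"]]

# Option 2: stack the 1-D arrays side by side as columns.
exog_stack = np.column_stack((lesdata["LESpost"].values, lesdata["QOF"].values))

# Both options hold the same values, one row per observation.
same_values = np.array_equal(exog_frame.values, exog_stack)

# The number of rows in exog matches the length of endog.
rows_match = exog_stack.shape[0] == len(lesdata["emadm"])

print(exog_stack.shape, same_values, rows_match)
```

Either object can then be passed as exog= in the SARIMAX call. The DataFrame version also keeps the column names, while the column_stack version is a plain array.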
