
Statsmodels - TypeError: ufunc 'isnan' not supported for the input types

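I am trying to run a block of code that evaluates the mean squared error for different values of P, D and Q in a SARIMAX model. This exact block has worked fine for me before and I have not changed it anywhere, so I can only assume the problem is the data. However, I have processed the data in the same way as before, so I don't know why it isn't working.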


# imports assumed; they were not shown in the original post
import statsmodels.api as sm
from sklearn.metrics import mean_squared_error

def evaluate_sarima_model(data, arima_order, s_order):
    # hold out the last 20% of the observations for walk-forward validation
    split = int(len(data) * 0.8)
    train, test = data[0:split], data[split:len(data)]
    past = [x for x in train]
    # make predictions, refitting on the growing history at every step
    predictions = list()
    for i in range(len(test)):
        model = sm.tsa.statespace.SARIMAX(past, order=arima_order, seasonal_order=s_order, enforce_stationarity=False, enforce_invertibility=False)
        model_fit = model.fit(disp=0)
        future = model_fit.forecast()[0]
        predictions.append(future)
        past.append(test[i])
    # calculate out of sample error
    error = mean_squared_error(test, predictions)
    return error

def evaluate_models(dataset, p_values, d_values, q_values, P_values, D_values, Q_values):
    best_score, best_cfg = float("inf"), None
    for p in p_values:
        for d in d_values:
            for q in q_values:
                for P in P_values:
                    for D in D_values:
                        for Q in Q_values:
                            order = (p, d, q)
                            s_order = (P, D, Q, 12)
                            try:
                                mse = evaluate_sarima_model(dataset, order, s_order)
                                if mse < best_score:
                                    best_score, best_cfg, seas = mse, order, s_order
                                print('SARIMA%s %s MSE=%.3f' % (order, seas, mse))
                            except:
                                # any exception raised above is silently skipped here
                                continue
    return print('Best SARIMA%s %s MSE=%.3f' % (best_cfg, seas, best_score))

# candidate values for the grid search
p_values = [1]
d_values = [1]
q_values = [1]
P_values = [x for x in range(0, 3)]
D_values = [x for x in range(0, 3)]
Q_values = [x for x in range(0, 3)]

I am using the following dataset:

DatetimeIndex: 175 entries, 2005-12-01 to 2020-06-01
Freq: MS
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   turnover  175 non-null    int32
dtypes: int32(1)
memory usage: 7.1 KB 

When I run it, I get the following error:


evaluate_models(turnover_month, p_values, d_values, q_values, P_values, D_values, Q_values)

UnboundLocalError: local variable 'seas' referenced before assignment 

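If I run the single model line on its own with arbitrary values for P, D and Q, I get the error below, so my assumption is that the problem lies in the first block and in how it handles this dataset: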

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
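
The two errors are most likely related. In evaluate_models, the bare `except: continue` skips every configuration whose fit raises, so if all of them fail with the TypeError, `seas` is never assigned and the final print raises UnboundLocalError instead of showing the real cause. Below is a minimal sketch of a search loop that keeps failures visible. The names `grid_search`, `score_fn` and `stub_score` are hypothetical, and the stub only stands in for evaluate_sarima_model so the example runs without statsmodels.

```python
from itertools import product


def grid_search(score_fn, orders, seasonal_orders):
    """Return (best_score, best_order, best_seasonal, failures)."""
    # Initialise every result variable up front so none can be unbound later.
    best_score, best_order, best_seasonal = float("inf"), None, None
    failures = []
    for order, s_order in product(orders, seasonal_orders):
        try:
            score = score_fn(order, s_order)
        except Exception as exc:
            # Record the error instead of discarding it silently.
            failures.append((order, s_order, repr(exc)))
            continue
        if score < best_score:
            best_score, best_order, best_seasonal = score, order, s_order
    return best_score, best_order, best_seasonal, failures


def stub_score(order, s_order):
    # Hypothetical stand-in for evaluate_sarima_model(data, order, s_order).
    if s_order[1] == 2:
        raise TypeError("ufunc 'isnan' not supported for the input types")
    return float(sum(order) + sum(s_order[:3]))


orders = [(1, 1, 1)]
seasonal_orders = [(P, D, Q, 12) for P, D, Q in product(range(3), repeat=3)]
best_score, best_order, best_seasonal, failures = grid_search(
    stub_score, orders, seasonal_orders
)
print("Best SARIMA%s %s MSE=%.3f" % (best_order, best_seasonal, best_score))
for order, s_order, message in failures:
    print("failed:", order, s_order, message)
```

With the failures listed, a TypeError raised on every configuration would show up immediately rather than as a confusing UnboundLocalError at the end.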

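After checking every line, it became clear that the problem was the data: the SARIMA model could not handle a DataFrame and needed a Series instead. All I needed was a simple .squeeze() on the df.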
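
A small sketch of why the DataFrame fails and what .squeeze() changes. The frame built here is a hypothetical stand-in for turnover_month with the same shape (175 monthly rows, one int32 column). That statsmodels hits the error through exactly this NaN check is an assumption, but the list comprehension `past=[x for x in train]` does iterate over column labels when `train` is a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for turnover_month: one int32 column, monthly index.
index = pd.date_range("2005-12-01", periods=175, freq="MS")
df = pd.DataFrame({"turnover": np.arange(175, dtype="int32")}, index=index)

# Iterating over a DataFrame yields its column labels, not its values.
past_from_frame = [x for x in df]
print(past_from_frame)

# A list of strings cannot be checked for NaN, which matches the reported error.
try:
    np.isnan(np.array(past_from_frame))
    isnan_failed = False
except TypeError as exc:
    isnan_failed = True
    print("TypeError:", exc)

# squeeze() collapses a single-column DataFrame into a Series,
# and iterating over a Series yields the actual observations.
series = df.squeeze()
past_from_series = [x for x in series]
print(type(series).__name__, len(past_from_series))
```

Passing `turnover_month.squeeze()` (or the column itself, `turnover_month["turnover"]`) to evaluate_models gives the function numeric values to work with. Inside the loop, `test.iloc[i]` is the explicit positional form of `test[i]` for a Series with a DatetimeIndex.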
