
How to get the prediction of test from 2D parameters of WLS regression in statsmodels

I'm incrementally updating the parameters of WLS regression functions using statsmodels.

I have a 10x3 dataset X that I declared like this:

X = np.array([[1,2,3],[1,2,3],[4,5,6],[1,2,3],[4,5,6],[1,2,3],[1,2,3],[4,5,6],[4,5,6],[1,2,3]])

This is my dataset, and I have a 10x2 endog array z that looks like this:

z =
[[  3.90311860e-322   2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]]

Now, after running import statsmodels.api as sm, I do this:

g = np.zeros([3, 2])  # g will store the regression parameters (3 coefficients x 2 endog columns)
mod_wls = sm.WLS(z, X)
temp_g = mod_wls.fit()
print(temp_g.params)

And I get this output:

[[ -5.92878775e-323  -2.77777778e+000]
 [ -4.94065646e-324  -4.44444444e-001]
 [  4.94065646e-323   1.88888889e+000]]

Earlier, from the answer to this question, I was able to predict the value of test data X_test using numpy.dot, like this:

np.dot(X_test, temp_g.params)

I understood that easily, since the endog vector, y, was a 1D array. But how does it work when my endog vector, in this case z, is 2D? When I try the above line as it was used in the 1D version, I get the following error:

   self._check_integrity()
  File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\data.py", line 247, in _check_integrity
    raise ValueError("endog and exog matrices are different sizes")
ValueError: endog and exog matrices are different sizes

np.dot(X_test, temp_g.params) should still work.

In some cases you need to check the orientation of the matrices; sometimes it's necessary to transpose.
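As a rough illustration of the shapes involved, the sketch below rebuilds the question's X and z (taking the first column of z as exact zeros, which is an assumption) and uses np.linalg.lstsq as a stand-in for temp_g.params, so it runs with NumPy alone. X_test is a hypothetical test set, not data from the question.

```python
import numpy as np

# Data from the question: X is 10x3, z is 10x2.
X = np.array([[1, 2, 3], [1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6],
              [1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6], [1, 2, 3]], dtype=float)
# Second column of z is 2 for rows [1,2,3] and -2 for rows [4,5,6];
# the first column is assumed to be exactly zero.
z = np.column_stack([np.zeros(10), np.where(X[:, 0] == 1, 2.0, -2.0)])

# Stand-in for temp_g.params (assumption): the minimum-norm least-squares
# solution, one column of coefficients per column of z, so shape (3, 2).
params, *_ = np.linalg.lstsq(X, z, rcond=None)

# Hypothetical test data with the same 3 columns as X.
X_test = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)

# (n_test, 3) dot (3, 2) -> (n_test, 2): one prediction column per column of z.
pred = np.dot(X_test, params)
print(params.shape, pred.shape)
```

If the parameters were stored as 2x3 instead, np.dot(X_test, params.T) would be the matching orientation.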

However, predict and most other methods of the results will not work, because the model assumes that the dependent variable, z, is 1D.

The question is again: what are you trying to do?

If you want to fit the columns of z independently, then iterate over them so that each y is 1D.

for y in z.T:
    res = sm.WLS(y, X).fit()

z.T allows iteration over columns.

In other cases, we usually stack the model so that y is 1D: its first part is z[:,0] and its second part is z[:,1]. The design matrix, or matrix of explanatory variables, has to be expanded correspondingly.
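One way the stacking could look is sketched below, assuming each column of z gets its own set of coefficients, so the expanded design matrix is block diagonal. np.kron and np.linalg.lstsq are choices made for this illustration, not something the answer prescribes; the stacked arrays could be passed to sm.WLS in the same way.

```python
import numpy as np

X = np.array([[1, 2, 3], [1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6],
              [1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6], [1, 2, 3]], dtype=float)
# z rebuilt from the question, first column assumed to be exact zeros.
z = np.column_stack([np.zeros(10), np.where(X[:, 0] == 1, 2.0, -2.0)])

# Stack the columns of z into one 1D vector: z[:, 0] first, then z[:, 1].
y_stacked = z.T.ravel()                    # shape (20,)

# Expand the design matrix to block-diagonal form [[X, 0], [0, X]].
X_stacked = np.kron(np.eye(z.shape[1]), X)  # shape (20, 6)

# Least-squares fit of the stacked model (stand-in for a WLS fit).
coef, *_ = np.linalg.lstsq(X_stacked, y_stacked, rcond=None)

# The first 3 coefficients belong to z[:, 0], the last 3 to z[:, 1].
params = coef.reshape(z.shape[1], X.shape[1]).T  # shape (3, 2)
print(params)
```

With a block-diagonal design this reproduces the column-by-column coefficients; the stacked form becomes more useful when coefficients are shared or weights differ across the two parts.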

Support for multivariate dependent variables is in the making for statsmodels, but it will still take some time to be ready.
