How to get the prediction of test data from 2D parameters of WLS regression in statsmodels
I'm incrementally updating the parameters of WLS regression functions using statsmodels.
I have a 10x3 dataset X that I declared like this:
X = np.array([[1,2,3],[1,2,3],[4,5,6],[1,2,3],[4,5,6],[1,2,3],[1,2,3],[4,5,6],[4,5,6],[1,2,3]])
This is my dataset, and I have a 10x2 endog array z that looks like this:
z =
[[ 3.90311860e-322 2.00000000e+000]
[ 0.00000000e+000 2.00000000e+000]
[ 0.00000000e+000 -2.00000000e+000]
[ 0.00000000e+000 2.00000000e+000]
[ 0.00000000e+000 -2.00000000e+000]
[ 0.00000000e+000 2.00000000e+000]
[ 0.00000000e+000 2.00000000e+000]
[ 0.00000000e+000 -2.00000000e+000]
[ 0.00000000e+000 -2.00000000e+000]
[ 0.00000000e+000 2.00000000e+000]]
Now, after running import statsmodels.api as sm, I do this:
g = np.zeros([3, 2]) # g(x) is a function that will store the regression parameters
mod_wls = sm.WLS(z, X)
temp_g = mod_wls.fit()
print(temp_g.params)
And I get this output:
[[ -5.92878775e-323 -2.77777778e+000]
[ -4.94065646e-324 -4.44444444e-001]
[ 4.94065646e-323 1.88888889e+000]]
Earlier, from the answer to this question, I was able to predict the value of test data X_test using numpy.dot, like this:
np.dot(X_test, temp_g.params)
I understood that easily, since the endog vector, y, was a 1D array. But how does it work when my endog array, in this case z, is 2D? When I try the above line as it was used in the 1D version, I get the following error:
self._check_integrity()
File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\data.py", line 247, in _check_integrity
raise ValueError("endog and exog matrices are different sizes")
ValueError: endog and exog matrices are different sizes
np.dot(X_test, temp_g.params) should still work: with a 3x2 params array, the result has one column of predictions for each column of z.
In some cases you need to check the orientation of the matrices; sometimes it's necessary to transpose.
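As a rough illustration of the shapes involved, here is a minimal NumPy-only sketch. The params values are rounded from the output printed in the question, and X_test is a hypothetical pair of test rows, since the question does not show it.

```python
import numpy as np

# Coefficients rounded from the params printed in the question; the first
# column is effectively zero (the printed values are denormal floats).
params = np.array([[0.0, -2.77777778],
                   [0.0, -0.44444444],
                   [0.0,  1.88888889]])   # shape (3, 2)

# Hypothetical test rows; X_test is not shown in the question.
X_test = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])      # shape (2, 3)

# (2, 3) dot (3, 2) gives (2, 2): one row per test observation,
# one column per column of z.
pred = np.dot(X_test, params)
print(pred.shape)
print(np.round(pred, 6))
```

If the parameters were stored the other way around, with shape (2, 3), the inner dimensions would no longer line up and np.dot(X_test, params.T) would be needed instead.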
However, predict and most other methods of the results will not work, because the model assumes that the dependent variable, z, is 1D.
The question, again, is what you are trying to do.
If you want to fit the columns of z independently, then iterate over them so that each y is 1D.
for y in z.T:
    res = sm.WLS(y, X).fit()
Iterating over z.T iterates over the columns of z.
In other cases, we usually stack the model so that y is 1D, with its first part being z[:,0] and its second part being z[:,1]. The design matrix, or matrix of explanatory variables, has to be expanded correspondingly.
Support for multivariate dependent variables is in the making for statsmodels but will still take some time to be ready.