I've trained an OLS model in Python using statsmodels. I fitted the model with the code below:
import statsmodels.api as sm
X2 = sm.add_constant(lin_x_train)
est = sm.OLS(lin_y_train, X2)
est2 = est.fit()
Using est2.params I obtain the following parameters:
const -0.394654
pow2 0.920915
eth_36hr -0.028754
eth_24dhr -0.068346
eth_16hr 0.064768
eth_72hr 0.001774
eth_48hr 0.001239
eth_24hr 0.026940
eth_2hr -0.163568
eth_3hr -0.042497
eth_4hr 0.033180
eth_5hr -0.029850
eth_6hr -0.040417
Now I want to predict the following case:
pow2 0
eth_36hr 2.91
eth_24dhr 1.34
eth_16hr 1.13
eth_72hr 13
eth_48hr 6.66
eth_24hr -9.89
eth_2hr -3.7
eth_3hr 2.37
eth_4hr 2.36
eth_5hr -2.28
eth_6hr -5.27
Since I've trained an OLS model, I assumed the prediction was simply:
y = a + B1*X1 + B2*X2 + ... + Bn*Xn
When I compute this by hand I get a y value of 0.132. However, using:
Xnew = newcase
Xnew = sm.add_constant(Xnew)
est2.predict(Xnew)
I get a value of 0.699.
What am I missing?
NB: using LinearRegression from sklearn I get the same value of 0.699, so I'm clearly missing something, but I can't work out what.
What I was missing was indeed quite simple and embarrassing. I had swapped two variable names in my manual calculation, so two coefficients were multiplied by the wrong inputs and the hand-computed prediction was wrong. The formula itself was correct:
y = a + B1*X1 + B2*X2 + ... + Bn*Xn
Before I discovered the mistake, I worked around it by saving the fitted model and loading it again to make the predictions with predict().