I am following a course in econometrics but unfortunately I'm stuck. I hope you can help me.
The following model is given:
https://i.stack.imgur.com/DfYCN.png
The OLS estimator of beta is given by: https://i.stack.imgur.com/r7bHD.png
But when I run the following Python script with a very large sample size, the estimates are terrible and do not converge to the true values. Could anyone explain this to me, please?
```python
import numpy as np

n = 100000
beta1 = 5.
beta2 = -.02
beta3 = .2
constant_term = np.ones(n)
X1 = np.linspace(10, 30, n)
X2 = np.linspace(0, 10, n)
X = np.column_stack((constant_term, X1, X2))
Y = np.zeros(n)
for i in range(n):
    u = np.random.normal(0., 1.)
    Y[i] = beta1 + beta2 * X[i, 1] + beta3 * X[i, 2] + u
Xt = np.transpose(X)
beta_ols = np.linalg.inv(Xt @ X) @ Xt @ Y
print(beta_ols)
```

It returns, for example, `[ 4.66326351 -0.32281745  0.87127398]`, but the true values are `[5., -.02, .2]`.
I am aware that there are also functions that can do this for me, but I want to do it manually to understand the material better.
Thanks!
Your variables `X1` and `X2` are collinear, i.e. not linearly independent: since `X1 = 10 + 2 * X2`, the `X1` column is an exact linear combination of the constant column and `X2`. Hence your matrix `Xt @ X` is not of full rank. The eigenvalues

```python
np.linalg.eig(Xt @ X)[0]
```

print

```
[4.65788929e+07, 3.72227442e-11, 1.87857084e+05]
```

Note that the second one is essentially zero; it is not exactly zero only because of floating-point rounding. When you invert this matrix you effectively divide by this very small number and massively lose precision. There are many ways to address this; for example, look up Tikhonov regularization. In Python you can use `Ridge` regression from scikit-learn.
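As a minimal sketch of that approach, here is your simulated data refit with scikit-learn's `Ridge` (the variable names mirror your script; `alpha=1.0` is just an illustrative penalty strength, not a tuned value):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 100000
X1 = np.linspace(10, 30, n)
X2 = np.linspace(0, 10, n)      # still perfectly collinear with X1
X = np.column_stack((X1, X2))   # no constant column: Ridge fits its own intercept
Y = 5. - .02 * X1 + .2 * X2 + rng.normal(0., 1., n)

# alpha is the Tikhonov penalty strength; the penalty makes the
# regularized normal equations invertible even when X is rank-deficient
model = Ridge(alpha=1.0).fit(X, Y)
print(model.intercept_, model.coef_)
```

Note that Ridge will not recover your "true" betas here, because under exact collinearity they are not identified in the first place; it just returns one stable, finite solution among the infinitely many that fit the data equally well.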
Of course, if you do not want to get into the finer details, you can just modify your code to make sure your regressors are linearly independent, e.g. replace the `X2` initialization with

```python
X2 = np.linspace(0, 10, n)**2
```
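With that change the design matrix has full rank and OLS recovers the true coefficients. A self-contained sketch (using `np.linalg.lstsq`, which solves the least-squares problem without explicitly forming `inv(Xt @ X)` and is the numerically preferred route anyway):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100000
X1 = np.linspace(10, 30, n)
X2 = np.linspace(0, 10, n)**2   # no longer a linear function of the other columns
X = np.column_stack((np.ones(n), X1, X2))
Y = 5. - .02 * X1 + .2 * X2 + rng.normal(0., 1., n)

# lstsq solves min ||X b - Y||^2 via an SVD-based routine,
# avoiding the precision loss of inverting the normal equations
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_ols)   # close to [5., -.02, .2]
```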