Dimensions do not match in linear regression

Question

I am trying to fit a simple linear regression model, but I don't understand why I get the error shown below.

Here is my code:

from sklearn import linear_model
regr = linear_model.LinearRegression()
regr.fit(X, Y)

which produces the following error:

ValueError: Found input variables with inconsistent numbers of samples: [1518, 15]

The shapes of X and Y are:

X.shape, Y.shape
((1518, 1), (15, 1))

I am trying to predict Y from X, but their dimensions do not match. How can I fix this?

Answer 1

It looks like you split your features and your outcome variable the wrong way.

Based on what you have written, you appear to have N=1518 samples and 15 columns, one of which is the outcome variable.

If this is the case, the matrix X and the vector Y should have the shapes:

X.shape = (1518,14)
Y.shape = (1518,1)

Assume you are given a pandas DataFrame with columns named F1...F15, and your dependent variable Y is F3. You can then split your variables as follows:

Y = df['F3']
X = df.drop('F3', axis=1)
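To see the split end to end, here is a minimal sketch on made-up data. The random DataFrame, the column names F1...F15, and the choice of F3 as the outcome are assumptions for illustration only, not the asker's actual data.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Synthetic stand-in for the real data: 1518 rows, 15 columns F1..F15.
rng = np.random.default_rng(0)
columns = [f"F{i}" for i in range(1, 16)]
df = pd.DataFrame(rng.normal(size=(1518, 15)), columns=columns)

# F3 is assumed to be the outcome; the other 14 columns are the features.
Y = df["F3"]
X = df.drop("F3", axis=1)

# Both now have 1518 rows, so the sample counts agree.
print(X.shape, Y.shape)

regr = linear_model.LinearRegression()
regr.fit(X, Y)

# One coefficient per feature column.
print(regr.coef_.shape)
```

Note that df['F3'] returns a one-dimensional Series, so Y.shape is (1518,) rather than (1518, 1). LinearRegression accepts either form; what matters is that X and Y have the same number of rows.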

Note: if you are currently using a NumPy array, you can easily wrap it in a DataFrame using:

import pandas as pd
df = pd.DataFrame(np_array)
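If you also want named columns so that the split by column name works, you can pass them when wrapping the array. This is a small sketch; the array contents and the names F1...F3 are placeholders chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical array with 4 samples (rows) and 3 columns.
np_array = np.arange(12, dtype=float).reshape(4, 3)

# Without `columns`, pandas labels the columns 0, 1, 2.
df = pd.DataFrame(np_array, columns=["F1", "F2", "F3"])

# Split by name exactly as with any other DataFrame.
Y = df["F3"]
X = df.drop("F3", axis=1)

print(X.shape, Y.shape)
```

Rows are treated as samples and columns as variables, so if your array is stored the other way around you would need to transpose it first (np_array.T).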
