
Dimensions do not match in linear regression

I am trying to fit a simple linear regression model but don't understand why I get an error.

Here is my code:

from sklearn import linear_model
regr = linear_model.LinearRegression()
regr.fit(X, Y)

which produces the following error:

ValueError: Found input variables with inconsistent numbers of samples: [1518, 15]

The shapes of X and Y are:

X.shape, Y.shape
((1518, 1), (15, 1))

I am trying to predict these Y out of X but my dimensions are not the same; how can I overcome this problem?

It looks like you split your features and outcome variable the wrong way.

Based on what you have written, you have N=1518 samples and 15 variables, one of which is the outcome variable.

If this is the case, your vector Y and matrix X should have the shapes:

X.shape = (1518,14)
Y.shape = (1518,1)
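
The error in the question is raised because X and Y have different numbers of rows (samples). A minimal sketch of a check you could run before fitting is shown here; the zero-filled arrays and the helper name same_sample_count are placeholders for illustration only, not part of scikit-learn:

```python
import numpy as np

# Placeholder arrays: the shapes from the question, and the shapes suggested above.
X_bad, Y_bad = np.zeros((1518, 1)), np.zeros((15, 1))
X_ok, Y_ok = np.zeros((1518, 14)), np.zeros((1518, 1))


def same_sample_count(X, Y):
    """Return True when X and Y have the same number of rows (samples)."""
    return X.shape[0] == Y.shape[0]


bad_matches = same_sample_count(X_bad, Y_bad)  # compares 1518 rows with 15 rows
ok_matches = same_sample_count(X_ok, Y_ok)     # compares 1518 rows with 1518 rows
print(bad_matches, ok_matches)
```

The number of columns in X may differ from that of Y; only the first dimension has to agree.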

Assume you are given a pandas DataFrame df with columns named F1...F15, and your dependent variable Y is F3; then you can split your variables as follows:

Y = df['F3']
X = df.drop('F3', axis=1)
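
To see the split and the fit together, here is a rough end-to-end sketch. The random data is a synthetic stand-in for your real table, and the column names F1...F15 are the assumed names from above:

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Synthetic stand-in for the real data: 1518 rows, columns F1..F15 (assumed names).
rng = np.random.default_rng(0)
columns = [f"F{i}" for i in range(1, 16)]
df = pd.DataFrame(rng.normal(size=(1518, 15)), columns=columns)

Y = df["F3"]               # outcome variable, one value per row
X = df.drop("F3", axis=1)  # the remaining 14 columns as features

regr = linear_model.LinearRegression()
regr.fit(X, Y)             # X and Y both have 1518 rows, so the sample counts agree
print(X.shape, Y.shape, regr.coef_.shape)
```

Note that df["F3"] returns a one-dimensional Series of shape (1518,). If you want a two-dimensional (1518, 1) target, df[["F3"]] should give you that; LinearRegression accepts either form.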

Note: if you are currently using a NumPy array, you can easily wrap it in a DataFrame using:

import pandas as pd
df = pd.DataFrame(np_array)
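
One caveat: without a columns argument, the DataFrame gets integer column labels (0, 1, 2, ...), so df['F3'] would not be found. A small sketch, assuming you want the F1...F15 names used above and using a tiny placeholder array in place of your data:

```python
import numpy as np
import pandas as pd

# Tiny placeholder array: 2 rows, 15 columns (your real array would have 1518 rows).
np_array = np.arange(30, dtype=float).reshape(2, 15)

# Assumed column names, matching the F1...F15 convention used in this answer.
columns = [f"F{i}" for i in range(1, 16)]
df = pd.DataFrame(np_array, columns=columns)

Y = df["F3"]               # third column becomes the outcome
X = df.drop("F3", axis=1)  # the other 14 columns are the features
print(X.shape, Y.shape)
```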
