I am trying to fit a simple linear regression model, but I don't understand why the error below appears.
Here is my code:
from sklearn import linear_model
regr = linear_model.LinearRegression()
regr.fit(X, Y)
which produces the following error:
ValueError: Found input variables with inconsistent numbers of samples: [1518, 15]
The shapes of X and Y are:
X.shape, Y.shape
((1518, 1), (15, 1))
I am trying to predict Y from X, but their dimensions are not the same. How can I overcome this problem?
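It looks like you split your features and outcome variable the wrong way. scikit-learn expects X and Y to have the same number of rows (one row per sample), which is why the shapes (1518, 1) and (15, 1) trigger the error.
Based on what you have written, you have N=1518 samples and 15 features, one of which is the outcome variable.
If this is the case, your input vector for Y and matrix for X should have the shapes: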
X.shape = (1518,14)
Y.shape = (1518,1)
Assume you are given a pandas DataFrame df with feature names F1...F15, and your dependent variable Y is F3; then you can split your variables as follows:
Y = df['F3']
X = df.drop('F3', axis=1)
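To check the split end to end, here is a minimal sketch. It assumes the layout guessed above (1518 rows, 15 columns named F1...F15, outcome in F3); the random data is purely a placeholder and not the asker's data. Note that selecting a single column gives a Series of shape (1518,) rather than (1518, 1), which LinearRegression also accepts.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Placeholder data: 1518 samples, 15 columns named F1..F15 (assumed layout)
rng = np.random.default_rng(0)
columns = [f"F{i}" for i in range(1, 16)]
df = pd.DataFrame(rng.normal(size=(1518, 15)), columns=columns)

# Outcome variable is one column; the remaining 14 columns are the features
Y = df["F3"]
X = df.drop("F3", axis=1)

# Both now have 1518 rows, so fit() no longer raises the ValueError
print(X.shape, Y.shape)

regr = linear_model.LinearRegression()
regr.fit(X, Y)

# One coefficient per feature column
print(regr.coef_.shape)
```

If X and Y still disagree in their first dimension after the split, the data was probably loaded or transposed incorrectly before this step.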
Note: if you are currently using a NumPy array, you can easily wrap it in a DataFrame using:
import pandas as pd
df = pd.DataFrame(np_array)
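One caveat: without a columns argument, pandas labels the columns 0, 1, 2, ... so a name like 'F3' will not exist. A small sketch, assuming a toy 4x3 array and hypothetical column names F1...F3 chosen only for illustration:

```python
import numpy as np
import pandas as pd

# Toy array standing in for the real data: 4 samples, 3 columns
np_array = np.arange(12, dtype=float).reshape(4, 3)

# Pass column names explicitly so the columns can be selected by name
df = pd.DataFrame(np_array, columns=["F1", "F2", "F3"])

Y = df["F3"]                # outcome column
X = df.drop("F3", axis=1)   # remaining feature columns

print(X.shape, Y.shape)
```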