linear regression train/shape output not correct

Question

I'm trying to use linear regression to predict the number of show releases there will be in the upcoming years. I have a data frame where every row is a release, with columns holding info like release year, genre, ... I would like to use this to predict the number of upcoming releases, so I made a new dataframe from all the unique years and used value_counts to get the number of releases per year. So now I have 85 rows with 2 columns: one with the year and the other with the number of releases.

I'm using sklearn for this, and this is the code I've written so far.

x = ML_content.drop('releases', axis = 1)
#x = ML_content['years']
y = ML_content['releases']
x_train, y_train, x_test, y_test = train_test_split(x, y, test_size = 20)
x_train.shape, y_train.shape
model = linear_model.LinearRegression()
model.fit(x_train, y_train)

I believe the result of the shape call is incorrect for what I want (this is the result: ((42, 1), (43, 1)) ), and for that reason the following code also won't work. Could anyone explain what I'm doing wrong or what needs to change?

Thank you for your time and help

Answer 1

According to https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
train_test_split returns its outputs in a different order than the one you used.
Returned order is: X_train, X_test, y_train, y_test
You unpacked: x_train, y_train, x_test, y_test
So your y_train actually holds the test part of x, which is why both shapes have a single column.
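
For reference, here is a minimal sketch of the corrected unpacking. The data is made up: the ML_content frame below is a hypothetical stand-in with 85 rows and the column names from the question (years, releases), and random_state is only added so the split is repeatable.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the asker's ML_content: 85 rows, one per year.
# The release counts are invented purely so the example runs.
years = np.arange(1936, 2021)  # 85 consecutive years
ML_content = pd.DataFrame({"years": years, "releases": np.arange(85) * 2 + 5})

x = ML_content.drop("releases", axis=1)  # 2D feature frame, shape (85, 1)
y = ML_content["releases"]               # 1D target series, shape (85,)

# Unpack in the order train_test_split returns:
# X_train, X_test, y_train, y_test
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=20, random_state=0
)

# Features and target of the training part now have the same number of rows.
print(x_train.shape, y_train.shape)

model = linear_model.LinearRegression()
model.fit(x_train, y_train)
predictions = model.predict(x_test)
```

Note that an integer test_size is the absolute number of test rows, so with 85 rows this leaves 65 for training and 20 for testing. Use a float between 0 and 1 (for example 0.2) if you want a fraction instead.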

1 answer

solution1
1 ACCEPTED 2020-07-15 11:05:37
