
linear regression train/shape output not correct

I'm trying to use linear regression to predict the number of show releases in the upcoming years. I have a data frame where every row is a release, with columns holding info like release year, genre, ... I would like to use this to predict the number of upcoming releases, so I made a new dataframe from all the unique years and used value_counts to get the number of releases per year. So now I have 85 rows with 2 columns: one with the year and the other with the number of releases.

I'm using sklearn for this, and this is the code I've written so far.

x = ML_content.drop('releases', axis = 1)
#x = ML_content['years']
y = ML_content['releases']
x_train, y_train, x_test, y_test = train_test_split(x, y, test_size = 20)
x_train.shape, y_train.shape
model = linear_model.LinearRegression()
model.fit(x_train, y_train)

I believe the resulting shapes are not what I want (this is the result: ((42, 1), (43, 1)) ), and for that reason the following code won't work either. Could anyone explain what I'm doing wrong, or what needs to change?

Thank you for your time and help

According to https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
train_test_split returns its results in a different order than the one you unpacked.
Returned order is: X_train, X_test, y_train, y_test
You wrote: x_train, y_train, x_test, y_test
So your y_train actually holds the test part of x, which is why both shapes show a single column.
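
As a minimal sketch of the fix, here is the same code with the unpacking order corrected. The ML_content frame below is a made-up stand-in (85 consecutive years with invented release counts), since the real data isn't shown in the question; random_state is added only so the split is reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for ML_content: 85 rows, one per year (invented values).
years = np.arange(1936, 2021)
ML_content = pd.DataFrame({"years": years, "releases": 2 * (years - 1936) + 5})

x = ML_content.drop("releases", axis=1)  # 2-D feature frame, shape (85, 1)
y = ML_content["releases"]               # 1-D target series, shape (85,)

# train_test_split returns X_train, X_test, y_train, y_test, in that order.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=20, random_state=0
)

print(x_train.shape, y_train.shape)  # (65, 1) (65,)
print(x_test.shape, y_test.shape)    # (20, 1) (20,)

model = LinearRegression()
model.fit(x_train, y_train)
predictions = model.predict(x_test)  # one prediction per test row
```

Note that an integer test_size is taken as an absolute number of rows, so test_size = 20 holds out 20 of the 85 rows; pass a float such as 0.2 if you want a fraction instead.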
