
After training a Linear Regression model with scikit-learn, how do I make predictions for new data points that are not in the original data set?

I am learning linear regression and wrote the code below using scikit-learn. After fitting the model and predicting on the training data, how do I make predictions for new data points that are not in my original data set?

The data set gives people's salaries according to their years of work experience.

For example, the predicted salary for a person with 15 years of work experience should be [167005.32889087].

[Image of the data set]

Here is my code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

data = pd.read_csv('project_1_dataset.csv')

# Column 0 is the input, column 1 is the target.
# scikit-learn expects 2-D arrays of shape (n_samples, n_features).
X = data.iloc[:, 0].values.reshape(-1, 1)
Y = data.iloc[:, 1].values.reshape(-1, 1)

linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)

# Predictions on the training inputs, used to draw the fitted line
Y_pred = linear_regressor.predict(X)

plt.scatter(X, Y)
plt.plot(X, Y_pred, color='red')
plt.show()

After fitting your model on your existing data set (i.e. after linear_regressor.fit(X, Y)), you can make predictions for new instances in the same way:

new_prediction = linear_regressor.predict(new_data)
print(new_prediction)

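where new_data holds your new data points. It must be a 2-D array of shape (n_samples, n_features), just like X, so a single value such as 15 years of experience is passed as [[15]].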
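As a minimal sketch of the call above for a single new point: the CSV from the question is not available here, so the training data below is made up (an exactly linear salary of 25000 + 9000 per year of experience), and the predicted number will differ from the one in the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up stand-in for project_1_dataset.csv (assumption, not the real data):
# salary = 25000 + 9000 * years_of_experience
X = np.array([1, 2, 3, 5, 8, 10], dtype=float).reshape(-1, 1)
Y = 25000 + 9000 * X

linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)

# One new sample with one feature -> shape (1, 1)
new_data = np.array([[15]])
new_prediction = linear_regressor.predict(new_data)
print(new_prediction)  # 2-D array of shape (1, 1), because Y was 2-D
```

Passing a bare number or a 1-D array such as np.array([15]) raises a ValueError, because predict expects one row per sample; reshape(-1, 1) or the double brackets take care of that.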

If you only want predictions for a few specific new data points, the above is enough. If your new data points are in another dataframe, pass that dataframe (or the relevant column, shaped like X) as new_data.
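For the dataframe case, something like the following should work. The column names YearsExperience and PredictedSalary and the training values are hypothetical, chosen only for illustration. The column is converted to a NumPy array before predicting, since the model in the question was fitted on plain arrays and recent scikit-learn versions warn when feature names are present at predict time but were not at fit time.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training data standing in for the question's CSV (assumption)
X = np.array([1, 2, 3, 5, 8, 10], dtype=float).reshape(-1, 1)
Y = 25000 + 9000 * X

linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)

# Hypothetical dataframe holding the new instances to predict
new_df = pd.DataFrame({'YearsExperience': [12, 15, 20]})

# Same preprocessing as for X: take the column, make it (n_samples, 1)
new_data = new_df.iloc[:, 0].values.reshape(-1, 1)
predictions = linear_regressor.predict(new_data)

# predict returns shape (n_samples, 1) here; flatten it to store as a column
new_df['PredictedSalary'] = predictions.ravel()
print(new_df)
```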


 