After training a Linear Regression model using scikit-learn, how do I make predictions for new data points that are not in the original data set?
I am learning linear regression and wrote this Linear Regression code using scikit-learn. After making predictions on the training data, how do I make predictions for new data points that are not in my original data set?
The data set gives the salaries of people according to their work experience.
For example, the predicted salary for a person with 15 years of work experience should be [167005.32889087].
Here is an image of the data set:
Here is my code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
data = pd.read_csv('project_1_dataset.csv')
X = data.iloc[:,0].values.reshape(-1,1)
Y = data.iloc[:,1].values.reshape(-1,1)
linear_regressor = LinearRegression()
linear_regressor.fit(X,Y)
Y_pred = linear_regressor.predict(X)
plt.scatter(X,Y)
plt.plot(X, Y_pred, color = 'red')
plt.show()
After fitting your model on your existing dataset (i.e. after linear_regressor.fit(X,Y)), you can make predictions for new instances in the same way:
new_prediction = linear_regressor.predict(new_data)
print(new_prediction)
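where new_data is your new data point. Note that predict expects a 2-D array of shape (n_samples, n_features), so a single value has to be wrapped accordingly, e.g. np.array([[15]]).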
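As a minimal sketch of the call above, the snippet below fits the model and predicts the salary for 15 years of experience. The training data here is made up (an exactly linear salary of 30000 + 9000 per year) because project_1_dataset.csv is not available, so the predicted value will differ from the one in the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for project_1_dataset.csv (assumed values, not the real data)
X = np.arange(1, 11).reshape(-1, 1)       # years of experience, shape (10, 1)
Y = (30000 + 9000 * X).astype(float)      # salary, shape (10, 1)

linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)

# predict expects a 2-D array of shape (n_samples, n_features),
# so a single new point with one feature has shape (1, 1)
new_data = np.array([[15]])
new_prediction = linear_regressor.predict(new_data)
print(new_prediction)
```

With this made-up data the printed value should be close to 165000. Several new points can be passed at once, e.g. np.array([[12], [15], [20]]).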
If you want to make predictions for a few specific new data points, the approach above should be enough. If your new data points are in another dataframe, you could replace new_data with that dataframe, which contains the new instances to be predicted.
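The dataframe case could look roughly like the sketch below. The column names (YearsExperience, Salary, PredictedSalary) and all values are assumptions for illustration. The new feature column is converted to a NumPy array in the same way as the training data, which keeps the input format consistent with how the model was fitted.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical training data (assumed column names and values)
train = pd.DataFrame({
    "YearsExperience": [1, 2, 3, 4, 5],
    "Salary": [39000.0, 48000.0, 57000.0, 66000.0, 75000.0],
})
X = train.iloc[:, 0].values.reshape(-1, 1)
Y = train.iloc[:, 1].values.reshape(-1, 1)

linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)

# Hypothetical dataframe holding the new instances to be predicted
new_df = pd.DataFrame({"YearsExperience": [12, 15, 20]})
new_X = new_df.iloc[:, 0].values.reshape(-1, 1)

# predict returns shape (n_samples, 1) here; flatten it to store as a column
new_df["PredictedSalary"] = linear_regressor.predict(new_X).ravel()
print(new_df)
```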