After training a linear regression model with scikit-learn, how do I make predictions for new data points that are not in the original dataset?

Question

I am learning linear regression and wrote this linear regression code using scikit-learn. After fitting the model and predicting on the training data, how do I make predictions for new data points that are not in my original data set?

This data set gives people's salaries according to their work experience.

For example, the predicted salary for a person with 15 years of work experience should be [167005.32889087].

Here is an image of the data set.

Here is my code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

data = pd.read_csv('project_1_dataset.csv')
X = data.iloc[:,0].values.reshape(-1,1)
Y = data.iloc[:,1].values.reshape(-1,1)

linear_regressor = LinearRegression()
linear_regressor.fit(X,Y)
Y_pred = linear_regressor.predict(X)

plt.scatter(X,Y)
plt.plot(X, Y_pred, color = 'red')
plt.show()

Answer 1

After fitting your model on your existing dataset (i.e. after linear_regressor.fit(X,Y)), you can make predictions for new instances in the same way:

new_prediction = linear_regressor.predict(new_data)
print(new_prediction)

where new_data is your new data point, passed as a 2D array with the same number of columns as X.
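For instance, here is a minimal sketch of predicting for a single person with 15 years of experience. The training values below are synthetic stand-ins (an assumption, since the asker's CSV is not available here), chosen so that salary = 30000 + 10000 * years exactly. Because of that, the number it prints will not match the asker's expected value.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the question's data (assumption, not the real CSV):
# salary = 30000 + 10000 * years_of_experience
X = np.array([1, 2, 3, 4, 5], dtype=float).reshape(-1, 1)
Y = (30000 + 10000 * X[:, 0]).reshape(-1, 1)

linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)

# predict() expects a 2D array of shape (n_samples, n_features),
# so a single value of 15 years is wrapped as [[15.0]].
new_data = np.array([[15.0]])
new_prediction = linear_regressor.predict(new_data)
print(new_prediction)
```

Passing a bare scalar or a 1D array such as np.array([15]) raises an error asking for a 2D array, which is why the question's code reshapes X with reshape(-1,1).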

If you want to make predictions for a few arbitrary new data points, the approach above should be enough. If your new data points are in another dataframe, you can replace new_data with that dataframe containing the new instances to be predicted.
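The dataframe case could look roughly like the sketch below. The column names YearsExperience, Salary and PredictedSalary are hypothetical, and the values are synthetic rather than taken from the asker's file. The model is fitted on a dataframe here so that the new dataframe can be passed with the same column name.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical column names and synthetic values (not the asker's real data)
train = pd.DataFrame({"YearsExperience": [1.0, 2.0, 3.0, 4.0, 5.0]})
train["Salary"] = 30000 + 10000 * train["YearsExperience"]

model = LinearRegression()
# Double brackets keep the feature as a 2D, single-column dataframe
model.fit(train[["YearsExperience"]], train["Salary"])

# A separate dataframe holding only the new instances to be predicted
new_df = pd.DataFrame({"YearsExperience": [6.5, 10.0, 15.0]})
new_df["PredictedSalary"] = model.predict(new_df[["YearsExperience"]])
print(new_df)
```

The new dataframe needs the same feature columns, in the same order, as the data used in fit.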

Answer 1
1, accepted, 2020-12-19 20:38:47
