Is there a way to get the values for the explanatory variables from a prediction using sklearn for multiple linear regression?

I'm trying to understand whether there is a way, using sklearn.linear_model.LinearRegression(), to get the values of the explanatory variables X given a predicted value Y.

For example, consider the MPG of a car. I can build the model using multiple explanatory variables and then (successfully) predict the MPG for a given set of X. However, can I do the reverse: give Y and get the predicted X values?

Sorry if this isn't very clear!

When you approximate values Y associated with points X using linear regression, what you are looking for is the linear function (f(x) = ax + k) that is closest to the points in the least-squares sense. So the fit does not give you Y itself; it gives you the linear function that best approximates your data.
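As a rough illustration of what "closest in the least-squares sense" means, here is a minimal sketch that solves the same kind of problem directly with numpy.linalg.lstsq on the toy data from the scikit-learn documentation. The names A, params and f are chosen for this example only, and this is not meant to describe how LinearRegression is implemented internally.

```python
import numpy as np

# Toy data from the scikit-learn documentation example: y = 1*x_0 + 2*x_1 + 3
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=float)
y = X @ np.array([1.0, 2.0]) + 3.0

# Append a column of ones so the intercept k is estimated together with a.
A = np.hstack([X, np.ones((X.shape[0], 1))])

# Least-squares solve: finds params minimizing ||A @ params - y||^2.
params, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
a, k = params[:2], params[2]


def f(x):
    """The fitted linear function f(x) = a . x + k (illustrative helper)."""
    return np.asarray(x, dtype=float) @ a + k


print("a =", a, "k =", k)
print("f([3, 5]) =", f([3, 5]))
```

The result of the fit is the pair (a, k), that is, the function f, which can then be evaluated at any new point.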

Can you do the opposite, that is, build a function that predicts X rather than Y? Yes. Consider the example from the sklearn.linear_model.LinearRegression documentation; we are going to tweak it a bit.

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> # y = 1 * x_0 + 2 * x_1 + 3
>>> y = np.dot(X, np.array([1, 2])) + 3
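>>> # Swap the roles of y and X: y is now the single input feature,
>>> # so it must be reshaped into a 2D column of shape (n_samples, 1)
>>> y_col = y.reshape(-1, 1)
>>> reg = LinearRegression().fit(y_col, X)
>>> # R^2 of the reversed fit, averaged over the two columns of X
>>> reg.score(y_col, X)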
0.8269230769230769
>>> reg.coef_
array([[0.23076923],
       [0.38461538]])
>>> reg.intercept_
array([-0.46153846, -1.26923077])
>>> reg.predict(np.array([[16.]]))
array([[3.23076923, 4.88461538]])

This lets you predict your explanatory variables from the MPG of your vehicles. Although this works, keep in mind that this approach will probably give you very poor results, because you are trying to approximate a cloud of points in a multidimensional space with a single line in that space. Check your score before trying to predict anything with your model.
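To make the limitation concrete, here is a small sketch on the same toy data. The two input points x_a and x_b are hypothetical values chosen for this illustration because the forward model maps both to the same Y. Since the reversed model only sees Y, it has to return the same X for both and recovers neither.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=float)
y = X @ np.array([1.0, 2.0]) + 3.0

forward = LinearRegression().fit(X, y)                  # X -> y
inverse = LinearRegression().fit(y.reshape(-1, 1), X)   # y -> X

# Two different (hypothetical) inputs with the same forward prediction:
# 1*3 + 2*1 + 3 == 1*1 + 2*2 + 3 == 8
x_a = np.array([[3.0, 1.0]])
x_b = np.array([[1.0, 2.0]])
y_a = forward.predict(x_a)
y_b = forward.predict(x_b)

# The reversed model receives only y, so both inputs collapse to one point.
x_a_back = inverse.predict(y_a.reshape(-1, 1))
x_b_back = inverse.predict(y_b.reshape(-1, 1))

print("y_a, y_b:", y_a, y_b)
print("recovered from y_a:", x_a_back)
print("recovered from y_b:", x_b_back)
```

Every X on the hyperplane a . x + k = Y produces the same Y, so a single Y cannot identify one of them; the reversed regression simply picks the point on its fitted line.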
