In sklearn LogisticRegression, what does fit_intercept=False mean?
I tried to compare the logistic regression result from statsmodels with the sklearn LogisticRegression result. Actually, I also tried to compare with the R result. I set C=1e6 (effectively no penalty), but I got almost the same coefficients except for the intercept.
model = sm.Logit(Y, X).fit()
print(model.summary())
==> intercept = 5.4020
model = LogisticRegression(C=1e6,fit_intercept=False)
model = model.fit(X, Y)
===> intercept = 2.4508
So I read the user guide, which says: "Specifies if a constant (aka bias or intercept) should be added to the decision function." What does this mean? Is this why sklearn LogisticRegression gave a different intercept value?
Please help me.
LogisticRegression is in some respects similar to the perceptron model and LinearRegression. You multiply your weights with the data points and compare the result to a threshold value b:
w_1 * x_1 + ... + w_n*x_n > b
This can be rewritten as:
-b + w_1 * x_1 + ... + w_n*x_n > 0
or
w_0 * 1 + w_1 * x_1 + ... + w_n*x_n > 0
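The two forms above are equivalent. A minimal numeric sketch (the weights, threshold, and data point below are made-up values for illustration):

```python
import numpy as np

# hypothetical weights, threshold, and one data point
w = np.array([0.5, -1.2])
b = 0.3
x = np.array([2.0, 1.0])

# original form: w_1*x_1 + ... + w_n*x_n > b
original = w @ x > b

# rewritten form: w_0*1 + w_1*x_1 + ... + w_n*x_n > 0, with w_0 = -b
w_aug = np.concatenate(([-b], w))   # weight vector with w_0 = -b prepended
x_aug = np.concatenate(([1.0], x))  # data point with a constant 1 prepended
rewritten = w_aug @ x_aug > 0

print(original, rewritten)  # both comparisons give the same answer
```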
For linear regression we keep this as-is; for the perceptron we feed it to a chosen activation function; and here, for logistic regression, we pass it to the logistic function.
Instead of learning n parameters, n+1 are now learned. For the perceptron this extra parameter is called the bias; for regression it is the intercept.
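You can see the n weights and the extra (n+1)-th parameter separately in sklearn's fitted attributes, `coef_` and `intercept_`. A quick sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# synthetic binary classification problem with n = 3 features
X, y = make_classification(n_features=3, n_informative=3, n_redundant=0,
                           random_state=0)

model = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)

print(model.coef_.shape)       # (1, 3): the n feature weights
print(model.intercept_.shape)  # (1,):   the extra intercept parameter
```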
For linear regression it's easy to understand geometrically. In the 2D case you can think of this as shifting the decision boundary by w_0 in the y direction:

y = m*x
vs y = m*x + c

So now the decision boundary does not go through (0, 0) anymore.
For the logistic function it is similar: the intercept shifts the curve away from the origin.
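A small sketch of that shift (the weight and intercept values are made up): without an intercept, the logistic function crosses 0.5 at x = 0; adding an intercept b moves that crossing to x = -b/w.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 2.0, 1.5  # hypothetical weight and intercept

# without an intercept, probability 0.5 sits at x = 0
print(sigmoid(w * 0.0))       # 0.5

# with the intercept, x = 0 no longer gives 0.5 ...
print(sigmoid(w * 0.0 + b))   # > 0.5

# ... the 0.5 crossing has moved to x = -b/w
print(sigmoid(w * (-b / w) + b))  # 0.5
```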
Implementation-wise, what happens is that you add one more weight and a constant column of 1s to the X values, and then you proceed as normal:
if fit_intercept:
    # prepend a column of ones so the first weight acts as the intercept
    intercept = np.ones((X_train.shape[0], 1))
    X_train = np.hstack((intercept, X_train))
weights = np.zeros(X_train.shape[1])
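This is also why `fit_intercept=False` changes your result: with it, sklearn learns no intercept at all unless you add the ones column yourself. A sketch showing the two routes agree (synthetic data; with `C=1e6` the tiny penalty on the manually added intercept column is negligible, so the coefficients match only approximately):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_features=2, n_redundant=0, random_state=0)

# route 1: let sklearn fit the intercept itself
m1 = LogisticRegression(C=1e6, fit_intercept=True, max_iter=1000).fit(X, y)

# route 2: prepend a ones column and disable the built-in intercept
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
m2 = LogisticRegression(C=1e6, fit_intercept=False, max_iter=1000).fit(X_aug, y)

print(m1.intercept_, m1.coef_)
print(m2.coef_)  # the first coefficient now plays the role of the intercept
```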