
In sklearn LogisticRegression, what is the meaning of fit_intercept=False?

I tried to compare a logistic regression result from statsmodels with the result from sklearn's LogisticRegression; I also tried to compare with the result from R. I set the option C=1e6 (effectively no penalty), and I got almost the same coefficients, except for the intercept.

model = sm.Logit(Y, X).fit()
print(model.summary())

==> intercept = 5.4020

model = LogisticRegression(C=1e6,fit_intercept=False)
model = model.fit(X, Y)

==> intercept = 2.4508

So I read the user guide, which says fit_intercept "Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function." What does this mean? Is this why sklearn's LogisticRegression gives a different intercept value?

Please help me.

LogisticRegression is in some respects similar to the perceptron model and LinearRegression. You multiply your weights with the data points and compare the result to a threshold value b:

w_1 * x_1 + ... + w_n*x_n > b

This can be rewritten as:

-b + w_1 * x_1 + ... + w_n*x_n > 0  

or

 w_0 * 1 + w_1 * x_1 + ... + w_n*x_n > 0
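A quick numerical check of this rewrite, with w_0 = -b and a constant input of 1 prepended to x (the weights, threshold, and data point below are made-up values for illustration):

```python
import numpy as np

# Made-up weights, threshold, and data point
w = np.array([0.5, -1.2])
b = 0.3
x = np.array([2.0, 1.0])

# Original form: w_1*x_1 + ... + w_n*x_n > b
decision_a = w @ x > b

# Rewritten form: prepend w_0 = -b and a constant input 1, compare to 0
w_aug = np.concatenate(([-b], w))
x_aug = np.concatenate(([1.0], x))
decision_b = w_aug @ x_aug > 0

print(decision_a, decision_b)  # both forms give the same decision
```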

For linear regression we keep this as is; for the perceptron we feed it into a chosen activation function; and here, for logistic regression, we pass it to the logistic function.

Instead of learning n parameters, n+1 are now learned. For the perceptron this extra parameter is called the bias; for regression, the intercept.

For linear regression it's easy to understand geometrically. In the 2D case you can think of this as shifting the decision boundary by w_0 in the y direction, i.e. y = m*x vs. y = m*x + c. So the decision boundary no longer goes through (0,0).

For the logistic function it is similar: the intercept shifts the curve away from the origin.
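You can see the shift directly from the logistic function: without an intercept, the predicted probability is 0.5 exactly at x = 0, while with an intercept b the 0.5 point moves to x = -b/w (the weight and intercept below are made-up values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 2.0, 1.5  # made-up weight and intercept

# Without an intercept, the p = 0.5 boundary sits at x = 0 ...
print(sigmoid(w * 0.0))     # 0.5

# ... with an intercept b, it moves to x = -b/w
x0 = -b / w                 # -0.75
print(sigmoid(w * x0 + b))  # 0.5
```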

Implementation-wise, what happens is that you add one more weight and a constant 1 column to the X values, and then proceed as normal:

if fit_intercept:
    # prepend a column of ones so the first weight acts as the intercept
    intercept = np.ones((X_train.shape[0], 1))
    X_train   = np.hstack((intercept, X_train))
    weights   = np.zeros(X_train.shape[1])  # one extra weight for the bias
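Putting it together, here is a minimal sketch of a full fit using that intercept trick, with plain gradient descent on the log-loss (the helper name fit_logreg, the learning rate, and the iteration count are my own illustrative choices, not sklearn's API):

```python
import numpy as np

def fit_logreg(X, y, fit_intercept=True, lr=0.5, n_iter=5000):
    """Minimal logistic regression via gradient descent (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if fit_intercept:
        # prepend a constant-1 column; its weight becomes the intercept w_0
        X = np.hstack((np.ones((X.shape[0], 1)), X))
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))  # logistic function
        w -= lr * X.T @ (p - y) / len(y)    # gradient of the mean log-loss
    return w

# Synthetic data from a model with intercept 1.5 and slope 2.0
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
p_true = 1.0 / (1.0 + np.exp(-(1.5 + 2.0 * X[:, 0])))
y = (rng.random(200) < p_true).astype(float)

w_with = fit_logreg(X, y, fit_intercept=True)      # learns [w_0, w_1]
w_without = fit_logreg(X, y, fit_intercept=False)  # learns [w_1] only; w_0 forced to 0
print(w_with, w_without)
```

With fit_intercept=False the decision boundary is forced through the origin, so the remaining weight absorbs part of the missing intercept and the coefficients no longer match a model that does fit one. That is also why the comparison in the question goes wrong: to match statsmodels or R, keep sklearn's default fit_intercept=True (and add the constant explicitly on the statsmodels side, e.g. with sm.add_constant) rather than turning the intercept off.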
