Finding the coefficients of a logistic regression in Python

Question

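I am working on a classification problem and need the coefficients of the logistic regression equation. I can find the coefficients in R, but I need to submit the project in Python. I could not find code for obtaining the logistic regression coefficients in Python. How do I get the coefficient values in Python?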

Answer 1

sklearn.linear_model.LogisticRegression is what you need. See this example:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(random_state=0).fit(X, y)

print(clf.coef_, clf.intercept_)
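
To see how these two attributes map onto the logistic regression equation, here is a minimal sketch for the binary case. It assumes a two-class subset of iris (classes 1 and 2) and `max_iter=1000`, both chosen only for illustration; the names `X_bin`, `y_bin`, `manual_logits`, and `manual_proba` are not from the answer above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Keep only classes 1 and 2 so the problem is binary (illustrative choice).
mask = y != 0
X_bin, y_bin = X[mask], y[mask]

clf = LogisticRegression(max_iter=1000).fit(X_bin, y_bin)

# For a binary problem coef_ has shape (1, n_features) and intercept_ has shape (1,).
b = clf.coef_[0]
b0 = clf.intercept_[0]

# Linear predictor (log-odds): b0 + b1*x1 + ... + bn*xn
manual_logits = X_bin @ b + b0

# The logistic (sigmoid) function turns log-odds into P(y == clf.classes_[1]).
manual_proba = 1.0 / (1.0 + np.exp(-manual_logits))

print("coefficients:", b)
print("intercept:", b0)
print(np.allclose(manual_logits, clf.decision_function(X_bin)))
print(np.allclose(manual_proba, clf.predict_proba(X_bin)[:, 1]))
```

In other words, `coef_` holds b1..bn and `intercept_` holds b0 of the fitted equation; for a multiclass fit there is one such row per class.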

Answer 2

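The statsmodels library gives you a breakdown of the coefficient results together with the associated p-values, which you can use to judge their significance.

Example using variables x1 and y1: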

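from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import statsmodels.api as sm

# x1 (features) and y1 (binary target) are assumed to be defined already.
x1_train, x1_test, y1_train, y1_test = train_test_split(x1, y1, random_state=0)

# scikit-learn fit, used here only to check accuracy
logreg = LogisticRegression().fit(x1_train, y1_train)

print("Training set score: {:.3f}".format(logreg.score(x1_train, y1_train)))
print("Test set score: {:.3f}".format(logreg.score(x1_test, y1_test)))

# statsmodels does not add an intercept automatically; add_constant supplies
# the "const" row of the summary (it is skipped if x1 already has one).
logit_model = sm.Logit(y1, sm.add_constant(x1))
result = logit_model.fit()
print(result.summary())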

Example output:

Optimization terminated successfully.
         Current function value: 0.596755
         Iterations 7
                           Logit Regression Results                           
==============================================================================
Dep. Variable:             IsCanceled   No. Observations:                20000
Model:                          Logit   Df Residuals:                    19996
Method:                           MLE   Df Model:                            3
Date:                Sat, 17 Aug 2019   Pseudo R-squ.:                  0.1391
Time:                        23:58:55   Log-Likelihood:                -11935.
converged:                       True   LL-Null:                       -13863.
                                        LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -2.1417      0.050    -43.216      0.000      -2.239      -2.045
x1             0.0055      0.000     32.013      0.000       0.005       0.006
x2             0.0236      0.001     36.465      0.000       0.022       0.025
x3             2.1137      0.104     20.400      0.000       1.911       2.317
==============================================================================

Answer 3

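Luffy, please remember to always share your code and what you have tried, so that we can understand your attempt and help you. In any case, I think this is what you are looking for: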

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])  # Your x values, for a 2-variable model
# Logistic regression needs class labels, not continuous targets
y = np.array([0, 0, 1, 1])  # Binary class labels, one per row of X
reg = LogisticRegression().fit(X, y)  # Fit the model to your X and y values
print(reg.coef_)  # Array of the coefficients (b1 and b2, or as many as your model has)
print(reg.intercept_)  # Value of the intercept (b0)
print(reg.predict(np.array([[3, 5]])))  # Predicted class labels for the given inputs

Answer 4

Have a look at the Logit model in the statsmodels library.

You would use it like this:

from statsmodels.discrete.discrete_model import Logit
from statsmodels.tools import add_constant

x = [...] # Observations
y = [...] # Response variable

x = add_constant(x)
print(Logit(y, x).fit().summary())
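
One thing to be aware of if you compare these numbers with R's `glm` or with scikit-learn: `Logit` fits a plain maximum-likelihood model, whereas `sklearn.linear_model.LogisticRegression` applies L2 regularization by default (`C=1.0`), so its coefficients are shrunk toward zero. A rough sketch of the comparison follows; the synthetic data and the choice of a very large `C` (to make the penalty negligible) are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from statsmodels.discrete.discrete_model import Logit
from statsmodels.tools import add_constant

rng = np.random.default_rng(42)

# Synthetic binary data with two features.
n = 1000
X = rng.normal(size=(n, 2))
log_odds = 0.5 + 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

# statsmodels: unpenalized maximum likelihood, intercept added explicitly.
sm_params = Logit(y, add_constant(X)).fit(disp=0).params

# scikit-learn: the intercept is fitted by default (fit_intercept=True).
# A very large C makes the default L2 penalty negligible.
skl = LogisticRegression(C=1e6, tol=1e-8, max_iter=1000).fit(X, y)
skl_params = np.concatenate([skl.intercept_, skl.coef_[0]])

# Default settings (C=1.0) for comparison: the coefficients are slightly shrunk.
skl_default = LogisticRegression(max_iter=1000).fit(X, y)
skl_default_params = np.concatenate([skl_default.intercept_, skl_default.coef_[0]])

print("statsmodels      :", sm_params)
print("sklearn, C=1e6   :", skl_params)
print("sklearn, default :", skl_default_params)
```

With the penalty effectively switched off, the two libraries should agree to within optimizer tolerance; with the defaults they will differ slightly, and more so on small data sets.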

Answer 5

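Assuming your training data X_train is a Pandas DataFrame and clf is your fitted logistic regression model, you can get the feature names together with their coefficient values with this line of code: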

pd.DataFrame(zip(X_train.columns, np.transpose(clf.coef_)), columns=['features', 'coef'])

Answer 6

A slight correction to the previous answer:

pd.DataFrame(zip(X_train.columns, np.transpose(clf.coef_.tolist()[0])), columns=['features', 'coef'])
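
Note that for a multiclass model `coef_` has one row per class, so indexing with `[0]` only shows the first class. A small sketch of a table covering every class, assuming the iris data loaded as a DataFrame via `as_frame=True` (available in scikit-learn 0.23 and later); the name `coef_table` is not from the answers above.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Load iris with X as a DataFrame so the feature names are available.
X, y = load_iris(return_X_y=True, as_frame=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# One row per class, one column per feature.
coef_table = pd.DataFrame(clf.coef_, index=clf.classes_, columns=X.columns)
coef_table["intercept"] = clf.intercept_

print(coef_table)
```

For a binary model the same code yields a single row, which matches the one-dimensional result of the line above.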

Answer 7

To give more detail, and to show how to replace the last layer of a PyTorch model:

#%%
"""
Get the weights & biases to set them to a nn.Linear layer in pytorch
"""
import numpy as np
import torch
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from torch import nn


X, y = load_iris(return_X_y=True)
print(f'{X.shape=}')
print(f'{y.shape=}')
Din: int = X.shape[1]
total_data_set_size: int = X.shape[0]
assert y.shape[0] == total_data_set_size

clf = LogisticRegression(random_state=0).fit(X, y)
out = clf.predict(X[:2, :])
# print(f'{out=}')

out = clf.predict_proba(X[:2, :])
print(f'{out=}')


clf.score(X, y)

# coef_: ndarray of shape (1, n_features) or (n_classes, n_features)
print(f'{clf.coef_.shape=}')
print(f'{clf.intercept_.shape=}')
assert (clf.coef_.shape[1] == Din)
Dout: int = clf.coef_.shape[0]
print(f'{Dout=} which is the number of classes too in classification')
assert (Dout == clf.intercept_.shape[0])

print()
num_classes: int = Dout
mdl = nn.Linear(in_features=Din, out_features=num_classes)
mdl.weight = torch.nn.Parameter(torch.from_numpy(clf.coef_))
mdl.bias = torch.nn.Parameter(torch.from_numpy(clf.intercept_))

out2 = torch.softmax(mdl(torch.from_numpy(X[:2, :])), dim=1)
print(f'{out2=}')

assert np.isclose(out2.detach().cpu().numpy(), out).all()

# -
# Sketch for an existing model (base_model and layer_to_replace are placeholders):
# module: nn.Module = getattr(base_model, layer_to_replace)
# num_classes: int = clf.coef_.shape[0]  # out_features=Dout
# num_features: int = clf.coef_.shape[1]  # in_features
# assert module.weight.size() == torch.Size([num_classes, num_features])
# assert module.bias.size() == torch.Size([num_classes])
# module.weight = torch.nn.Parameter(torch.from_numpy(clf.coef_))
# module.bias = torch.nn.Parameter(torch.from_numpy(clf.intercept_))

7 answers

Answer 1
8 accepted 2019-09-13 13:33:05

Answer 2
4 2019-09-14 23:15:51

Answer 3
1 2019-09-13 13:27:40

Answer 4
1 2019-09-13 13:32:41

Answer 5
0 2020-09-13 11:51:58

Answer 6
0 2020-10-13 21:21:19

Answer 7
0 2021-11-10 20:10:42
