Get the coefficients of my sklearn polynomial regression model in Python
dataset = pd.read_excel('dfmodel.xlsx')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
from sklearn.metrics import r2_score
y_pred = regressor.predict(X_test)  # generate predictions on the held-out test set
print('The R2 score of Multi-Linear Regression model is: ', r2_score(y_test, y_pred))
Using the code above, I managed to fit a linear regression and get its R2 score. How can I get the beta coefficient for each predictor?
From the sklearn.linear_model.LinearRegression documentation page, you can find the coefficients (slopes) and the intercept in regressor.coef_ and regressor.intercept_, respectively.

If you use sklearn.preprocessing.StandardScaler before fitting your model, then the regression coefficients should be the beta coefficients you're looking for.
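A minimal sketch of that suggestion, using hypothetical toy data in place of the spreadsheet columns: standardize the predictors first, and coef_ then gives standardized (beta) coefficients.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the spreadsheet columns
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Standardize predictors so coef_ gives standardized (beta) coefficients
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

model = LinearRegression().fit(X_std, y)
print(model.coef_)       # one standardized coefficient per predictor
print(model.intercept_)  # intercept on the standardized scale
```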
Personally, I prefer np.polyfit() with degree 1, which does it in a single step.

import numpy as np
np.polyfit(X, y, 1)[0]  # returns the slope (beta); higher degrees return more coefficients

So for your question, if I understand it, you want the predicted y values regressed against the initial y, which would be something like:

np.polyfit(y_test, y_pred, 1)[0]

I would test np.polyfit(X_test, y_pred, 1)[0] as well, though. (Note that np.polyfit expects a 1-D x array, unlike LinearRegression.)
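A self-contained sketch of the polyfit approach, with hypothetical 1-D data (the slope and intercept values below are made up for illustration):

```python
import numpy as np

# Hypothetical 1-D example: np.polyfit needs a 1-D x, unlike LinearRegression
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=200)

# Degree-1 fit returns coefficients highest degree first: [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)
```

Because the coefficients come back highest degree first, index [0] is the slope for a degree-1 fit, as the answer above uses.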
Use regressor.coef_. By comparing with the statsmodels implementation, you can see how these coefficients map to the predictors in order:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression(fit_intercept=False)
regressor.fit(X, y)
regressor.coef_
# array([0.43160901, 0.42441214])
The statsmodels version:
import statsmodels.api as sm

# No constant term is added here, matching fit_intercept=False in the sklearn model
mod = sm.OLS(y, X)
res = mod.fit()
print(res.summary())
                                 OLS Regression Results
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.624
Model:                            OLS   Adj. R-squared (uncentered):              0.623
Method:                 Least Squares   F-statistic:                              414.0
Date:                Tue, 29 Sep 2020   Prob (F-statistic):                   1.25e-106
Time:                        17:03:27   Log-Likelihood:                         -192.54
No. Observations:                 500   AIC:                                      389.1
Df Residuals:                     498   BIC:                                      397.5
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.4316      0.041     10.484      0.000       0.351       0.512
x2             0.4244      0.041     10.407      0.000       0.344       0.505
==============================================================================
Omnibus:                       36.830   Durbin-Watson:                   1.967
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               13.197
Skew:                           0.059   Prob(JB):                      0.00136
Kurtosis:                       2.213   Cond. No.                         2.57
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
You can run a direct equivalence test with:
np.array([regressor.coef_.round(8) == res.params.round(8)]).all() # True
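To read the coefficients off by name rather than by position, one option is to pair coef_ with the DataFrame's column labels. A sketch with hypothetical data (column names "x1"/"x2" and the 0.4 slopes are illustrative, not from the original spreadsheet):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data frame with named predictors
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(100, 2)), columns=["x1", "x2"])
y = 0.4 * df["x1"] + 0.4 * df["x2"] + rng.normal(scale=0.1, size=100)

reg = LinearRegression(fit_intercept=False).fit(df, y)

# coef_ follows the column order of the training data,
# so pairing it with df.columns labels each beta
betas = pd.Series(reg.coef_, index=df.columns)
print(betas)
```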