
Multi Variable Regression statsmodels.api

I've looked through the documentation and still can't figure this out. I want to run a WLS regression with multiple regressors (independent variables).

statsmodels.api is imported as sm

Example with a single variable:

X = Height
Y = Weight

res = sm.OLS(Y, X).fit()
res.summary()

Say I also have:

X2 = Age

How do I add this to my regression?

You can put them into a pandas DataFrame and select the columns by name (this way the output looks nicer too):

import statsmodels.api as sm
import pandas as pd
import numpy as np

Height = np.random.uniform(0,1,100)
Weight = np.random.uniform(0,1,100)
Age = np.random.uniform(0,30,100)

df = pd.DataFrame({'Height':Height,'Weight':Weight,'Age':Age})

res = sm.OLS(df['Height'],df[['Weight','Age']]).fit()

In [10]: res.summary()
Out[10]: 
<class 'statsmodels.iolib.summary.Summary'>
"""
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                 Height   R-squared (uncentered):                   0.700
Model:                            OLS   Adj. R-squared (uncentered):              0.694
Method:                 Least Squares   F-statistic:                              114.3
Date:                Mon, 24 Aug 2020   Prob (F-statistic):                    2.43e-26
Time:                        15:54:30   Log-Likelihood:                         -28.374
No. Observations:                 100   AIC:                                      60.75
Df Residuals:                      98   BIC:                                      65.96
Df Model:                           2                                                  
Covariance Type:            nonrobust                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Weight         0.1787      0.090      1.988      0.050       0.000       0.357
Age            0.0229      0.003      8.235      0.000       0.017       0.028
==============================================================================
Omnibus:                        2.938   Durbin-Watson:                   1.813
Prob(Omnibus):                  0.230   Jarque-Bera (JB):                2.223
Skew:                          -0.211   Prob(JB):                        0.329
Kurtosis:                       2.404   Cond. No.                         49.7
==============================================================================

I use a second-order polynomial to model how height and age affect a soldier's weight. You can pick up ansur_2_m.csv on my GitHub.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

# Load only the three columns needed; dtypes are inferred so that rows
# with missing values can be read and then dropped
df = pd.read_csv('ANSUR_2_M.csv', encoding="ISO-8859-1",
                 usecols=['Weightlbs', 'Heightin', 'Age'])
df = df.dropna()
df = df.reset_index(drop=True)  # reset_index returns a new frame, so assign it
df['Heightin2'] = df['Heightin']**2
df['Age2'] = df['Age']**2

# Second-order polynomial in both height and age
formula = "Weightlbs ~ Heightin+Heightin2+Age+Age2"
model_ols = smf.ols(formula, data=df).fit()
minHeight = df['Heightin'].min()
maxHeight = df['Heightin'].max()
medianAge = df['Age'].median()
print(minHeight, maxHeight, medianAge)

# Prediction grid for a 28-year-old
df2 = pd.DataFrame()
df2['Heightin'] = np.linspace(60, 100, 50)
df2['Heightin2'] = df2['Heightin']**2
df2['Age'] = 28
df2['Age2'] = df2['Age']**2  # square the grid's own Age, not the training data's

# Prediction grid for a 45-year-old
df3 = pd.DataFrame()
df3['Heightin'] = np.linspace(60, 100, 50)
df3['Heightin2'] = df3['Heightin']**2
df3['Age'] = 45
df3['Age2'] = df3['Age']**2

prediction28 = model_ols.predict(df2)
prediction45 = model_ols.predict(df3)

plt.clf()
plt.plot(df2['Heightin'], prediction28, label="Age 28")
plt.plot(df3['Heightin'], prediction45, label="Age 45")
plt.ylabel("Weight lbs")  # call the function; assigning to it would overwrite it
plt.xlabel("Height in")
plt.legend()
plt.show()

print('A 45-year-old soldier is likely to weigh more than a 28-year-old soldier')
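
As a variation on the code above, the squared terms can be written directly in the formula with I(...), so the prediction frames only need the raw Heightin and Age columns. This is a sketch under stated assumptions: the data below is synthetic (the ANSUR file is not used), and the coefficients used to generate it are made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible
n = 200

# Synthetic stand-in for the ANSUR columns (assumption, not real data)
df = pd.DataFrame({
    "Heightin": rng.uniform(60, 80, n),
    "Age": rng.uniform(18, 55, n),
})
df["Weightlbs"] = (-200 + 5.0 * df["Heightin"] + 0.01 * df["Heightin"] ** 2
                   + 1.5 * df["Age"] - 0.01 * df["Age"] ** 2
                   + rng.normal(0, 10, n))

# I(...) tells the formula parser to evaluate the expression as plain Python,
# so no Heightin2 / Age2 helper columns are needed
formula = "Weightlbs ~ Heightin + I(Heightin**2) + Age + I(Age**2)"
model = smf.ols(formula, data=df).fit()

# New data only needs the raw columns; the squares are recomputed on predict
new = pd.DataFrame({"Heightin": np.linspace(60, 80, 5), "Age": 28})
pred = model.predict(new)
print(pred)
```

Because the transformation lives in the formula, it cannot drift out of sync between the training frame and the prediction frames.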
