Getting transformed X values from OLS model using statsmodels

I am trying to do a linear regression. With the results, I want to multiply each x by its own estimated coefficient: x_i · β_i.

However, I am applying a lot of transformations to x_i.

For example:

import statsmodels.api as sm
import statsmodels.formula.api as smf
import numpy as np

def log_plus_1(x):
    return np.log(x + 1.0)

df = sm.datasets.get_rdataset("Guerry", "HistData").data
df = df[['Lottery', 'Literacy', 'Wealth', 'Region']].dropna()
formule = 'Lottery ~ pow(Literacy,2) + log_plus_1(Wealth)'
mod = smf.ols(formula=formule, data=df)
res = mod.fit()
res.params

Now I need the values of pow(Literacy, 2) and log_plus_1(Wealth). Since they go into the model, I was hoping to get them back out of it, instead of transforming the original dataset again myself.

In R I would use res$model to get it.

The data is stored as attributes of the model, e.g. the design matrix is mod.exog and the dependent (response) variable is mod.endog. The matching column names are in mod.exog_names.

(I'm not sure I remember the details of the following correctly: the transformed design matrix that patsy returns should, in this case, be a pandas DataFrame, and it should be stored in mod.data.orig_exog or something like that.)
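A minimal sketch of how the transformed columns could be pulled out and multiplied by their coefficients. It uses only mod.exog and mod.exog_names, so it does not rely on the attribute name I was unsure about. The data frame here is a synthetic stand-in with made-up values (so nothing has to be downloaded), and it assumes there are no missing values, so the rows of mod.exog line up with df.index.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def log_plus_1(x):
    return np.log(x + 1.0)

# Synthetic stand-in for the Guerry data (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Literacy": rng.uniform(10, 80, size=50),
    "Wealth": rng.uniform(1, 86, size=50),
})
df["Lottery"] = (
    40 - 0.003 * df["Literacy"] ** 2
    + 5 * log_plus_1(df["Wealth"])
    + rng.normal(0, 5, size=50)
)

mod = smf.ols("Lottery ~ pow(Literacy, 2) + log_plus_1(Wealth)", data=df)
res = mod.fit()

# Transformed design matrix (including the intercept column) as a labelled DataFrame.
# Assumes no rows were dropped for missing values, so df.index matches mod.exog.
X = pd.DataFrame(mod.exog, columns=mod.exog_names, index=df.index)

# Multiply each transformed column by its estimated coefficient.
# pandas aligns the params Series with the DataFrame columns by name.
contrib = X * res.params

print(contrib.head())
```

As a sanity check, the row sums of contrib should reproduce res.fittedvalues.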

res.predict handles the transformation automatically, i.e. patsy uses the formula information to transform the explanatory variables for prediction in the same way as the data was transformed when the model was created.
However, predict only returns the prediction and not the internally transformed exog that it used.
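If the transformed exog for new data is needed as well, something like the following sketch might work. It assumes a statsmodels version that builds formulas with patsy and keeps the patsy design information in mod.data.design_info. That attribute is an implementation detail and may differ between versions. The data is again synthetic and only for illustration.

```python
import numpy as np
import pandas as pd
import patsy
import statsmodels.formula.api as smf

def log_plus_1(x):
    return np.log(x + 1.0)

# Synthetic training data (hypothetical values, for illustration only)
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Literacy": rng.uniform(10, 80, size=40),
    "Wealth": rng.uniform(1, 86, size=40),
})
df["Lottery"] = 30 + 4 * log_plus_1(df["Wealth"]) + rng.normal(0, 5, size=40)

mod = smf.ols("Lottery ~ pow(Literacy, 2) + log_plus_1(Wealth)", data=df)
res = mod.fit()

# New observations on the original (untransformed) scale
new_df = pd.DataFrame({"Literacy": [20.0, 55.0], "Wealth": [10.0, 70.0]})

# Assumption: the patsy DesignInfo is stored on the model data object.
design_info = mod.data.design_info

# Re-apply the stored transformations to the new data
X_new = patsy.dmatrix(design_info, new_df, return_type="dataframe")

# Per-term contributions x_i * beta_i for the new rows
contrib_new = X_new * res.params

print(X_new)
```

The row sums of contrib_new should agree with res.predict(new_df), which is one way to check that the same transformation was applied.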
