
Statsmodels Linear regression: set value for a variable

I have been using the OLS part of Statsmodels to determine some variables for a set of measurements. The basic format is nothing special, example shown below.

model = smf.ols('Out ~ l0 + l1 + l2 + l3 - 1', data = df_results).fit()

I have run the model and have values for all the variables (l0, l1, l2, etc.). Essentially I'm looking for a function or some tool (which I can't seem to find) where I can set a fixed value for the l3 coefficient and then determine what the new coefficients for l0, l1 and l2 are, for the same set of measurements.

The simplest way is to subtract the known term from the dependent variable.

offset = b_known * x_i  # fixed coefficient times its explanatory variable (here l3)
y_diff = y - offset     # adjusted dependent variable

and then regress y_diff on the remaining explanatory variables, i.e. in this case

res = smf.ols('Out_diff ~ l0 + l1 + l2 - 1', data = df_results).fit()
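The whole subtraction approach can be sketched end to end. This is a minimal example on simulated data; the column names (l0 to l3) follow the question, while the true coefficients and the fixed value b_known are made up for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# simulate data matching the question's column names
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(rng.normal(size=(n, 4)), columns=["l0", "l1", "l2", "l3"])
df["Out"] = (1.0 * df["l0"] + 2.0 * df["l1"] - 0.5 * df["l2"]
             + 3.0 * df["l3"] + rng.normal(scale=0.1, size=n))

# fix the l3 coefficient at a known value and move that term to the left side
b_known = 3.0
df["Out_diff"] = df["Out"] - b_known * df["l3"]

# regress the adjusted dependent variable on the remaining regressors
res = smf.ols("Out_diff ~ l0 + l1 + l2 - 1", data=df).fit()
print(res.params)
```

The remaining coefficients come out close to the true values used in the simulation, since l3 enters with exactly the fixed coefficient.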

This subtraction trick is not possible in nonlinear models, because we cannot just move a known term from the right to the left side. Therefore, some nonlinear models like GLM and the discrete count models take an offset argument, which is essentially an explanatory variable with a fixed coefficient equal to 1.

This means that the above is equivalent to

res_glm = smf.glm('Out ~ l0 + l1 + l2 - 1', data = df_results, offset=offset).fit()

with offset defined above. The default family in GLM is Gaussian with the identity link, so this reproduces the OLS estimates.

More general linear (or affine) restrictions require a reparameterization of the design matrix in addition to using the offset. This is currently only available for GLM via fit_constrained, see for example

Constrained regression in Python and How to add sum to zero constraint to GLM in Python?
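For completeness, fit_constrained accepts R-style constraint strings, which covers both fixing a single coefficient and more general linear restrictions. A minimal sketch on simulated data (column names and true coefficients are made up for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 200
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["l0", "l1", "l2"])
df["Out"] = (1.0 * df["l0"] + 2.0 * df["l1"] + 1.0 * df["l2"]
             + rng.normal(scale=0.1, size=n))

mod = smf.glm("Out ~ l0 + l1 + l2 - 1", data=df)

# fix a single coefficient at a known value
res_fix = mod.fit_constrained("l2 = 1")

# impose a linear restriction across coefficients
res_sum = mod.fit_constrained("l0 + l2 = 2")

print(res_fix.params)
print(res_sum.params)
```

The constraints are enforced exactly in the fitted parameters, and the reported standard errors account for the restriction.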
