
Is there a statsmodels formula equivalent of R's glm shorthand y ~ .?

I have a dataframe containing the following columns:

y as the dependent variable
A, B, C, D, E, F as the independent variables.

I want to fit a regression using the statsmodels module, and I don't want to spell out the formula argument as follows:

formula = 'y ~ A + B + C + D + E + F'

R's glm function offers a shorthand for this: formula = y ~ .

I was wondering whether statsmodels has a similar shortcut.

PS: the actual dataframe that I'm working with has 27 variables.

patsy, the formula library used by statsmodels, has no shortcut like ".".

However, building the formula with Python string manipulation is simple.

Here is an example that I'm currently using. DATA is my dataframe, docvis is the outcome variable, and const is a constant column that is not needed in the formula.

formula = "docvis ~ " + " + ".join([i for i in DATA if i not in ["docvis", "const"]])
formula
'docvis ~ offer + ssiratio + age + educyr + physician + nonphysician + medicaid + private + female + phylim + actlim + income + totchr + insured + age2 + linc + bh + ldocvis + ldocvisa + docbin + aget + aget2 + incomet'

It would be more explicit to iterate over the column names directly, using DATA.columns.
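The same idea can be wrapped in a small helper. This is only a sketch: build_formula and its parameters are hypothetical names, not part of statsmodels or patsy, and the column list is made up to match the question. It assumes column names contain no double quotes; names that are not valid Python identifiers are wrapped in patsy's Q() quoting function.

```python
def build_formula(columns, response, exclude=()):
    """Return a patsy-style formula 'response ~ a + b + ...' from column names.

    Hypothetical helper: `columns` is any iterable of column names,
    for example DATA.columns from a pandas DataFrame.
    """
    skip = {response, *exclude}
    terms = []
    for name in columns:
        if name in skip:
            continue
        name = str(name)
        # patsy's Q("...") quotes names that are not valid Python identifiers
        # (assumes the name itself contains no double quotes).
        terms.append(name if name.isidentifier() else f'Q("{name}")')
    return f"{response} ~ " + " + ".join(terms)


# Made-up column list mirroring the question; with a real dataframe,
# pass DATA.columns instead.
columns = ["y", "const", "A", "B", "C", "D", "E", "F"]
formula = build_formula(columns, response="y", exclude=["const"])
print(formula)  # y ~ A + B + C + D + E + F
```

The resulting string can then be passed as the formula argument, for example statsmodels.formula.api.ols(formula, data=DATA).fit().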
