Python: linear regression with fixed effects (adapting Stata code)

I'm trying to replicate code from Stata that estimates a linear regression model.

The problem is that there are two fixed-effect variables: field of study ("Domaine d'étude EF") and university ("Université EF").

[Image: the linear regression with fixed effects and control variables]

Here is what I have so far:

import statsmodels.formula.api as smf
results = smf.ols('discriminant ~ diff_eval_formfr + presse + trav_sup + recrut_seul + proced_auditions + taux_insertion_30mois + taux_stable_30 + taux_plei_30 + sal_med', data=da).fit()

I don't know how to add the fixed effects, or even whether it is possible.

Any advice will be appreciated.

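If the fixed-effect variable is a categorical string variable, you can just include it in the formula. statsmodels will convert each distinct string value to a dummy variable and include the dummies in the regression, dropping one value as the reference category when the model has an intercept.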

If the fixed-effect variable is numeric, you have to tell statsmodels to treat its values as categories rather than numbers by wrapping the name in C().
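
To see what C() changes, here is a minimal sketch on made-up data. The DataFrame toy and the columns x, y and group are hypothetical names that are not part of the question; the point is only to compare the coefficient names the two formulas produce.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data: `group` is a numeric code with three values (hypothetical name).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat([1, 2, 3], 20),
})
group_effect = toy["group"].map({1: 0.0, 2: 5.0, 3: -1.0})
toy["y"] = 2.0 * toy["x"] + group_effect + rng.normal(size=60)

# Without C(): `group` enters as one numeric regressor with a single slope.
as_number = smf.ols("y ~ x + group", data=toy).fit()

# With C(): one dummy per value of `group`, except the reference value,
# which is absorbed by the intercept.
as_category = smf.ols("y ~ x + C(group)", data=toy).fit()

print(list(as_number.params.index))
print(list(as_category.params.index))
```

The first model should report a single coefficient for group, while the second reports one coefficient for each non-reference value of group.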

Let's say you have one string fixed-effect variable (fe1) and one numeric fixed-effect variable (fe2). Then you can add them like this:

import statsmodels.formula.api as smf
results = smf.ols('discriminant ~ diff_eval_formfr + presse + trav_sup + recrut_seul + proced_auditions + taux_insertion_30mois + taux_stable_30 + taux_plei_30 + sal_med + fe1 + C(fe2)', data=da).fit()

Note that this includes each fixed-effect variable as a set of dummies, one per value. This is how fixed effects are mathematically included in a regression. Stata does the same thing, but most fixed-effect options in Stata leave the fixed-effect estimates out of the results table.
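
If you want a coefficient table without the dummy rows, one option is to filter them out after fitting. The sketch below uses made-up data with a single string fixed effect (toy, unit, x and y are hypothetical names). As a sanity check, it also compares the slope with the one from demeaning y and x within each unit, which is assumed here only as an illustration of why the dummies act as fixed effects, not as what Stata does internally.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data: `unit` is a hypothetical string fixed-effect variable.
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "unit": np.repeat(["a", "b", "c", "d"], 25),
    "x": rng.normal(size=100),
})
effects = {"a": 0.0, "b": 3.0, "c": -2.0, "d": 1.0}
toy["y"] = 1.5 * toy["x"] + toy["unit"].map(effects) + rng.normal(size=100)

# Dummy-variable regression: the string column is expanded into dummies.
dummies_fit = smf.ols("y ~ x + unit", data=toy).fit()

# Keep only the coefficients that are not fixed-effect dummies, which
# mimics a results table with the fixed effects left out.
is_dummy = dummies_fit.params.index.str.startswith("unit[")
print(dummies_fit.params[~is_dummy])

# Within transformation: subtract each unit's mean from y and x,
# then regress the demeaned y on the demeaned x without an intercept.
unit_means = toy.groupby("unit")[["y", "x"]].transform("mean")
demeaned = toy[["y", "x"]] - unit_means
within_fit = smf.ols("y ~ x - 1", data=demeaned).fit()

# The slope on x agrees in both approaches up to floating-point error.
print(dummies_fit.params["x"], within_fit.params["x"])
```

Only the point estimate is compared here. The standard errors of the demeaned regression do not account for the degrees of freedom used up by the unit means, so the dummy-variable fit is the one to read them from.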
