
Applying Fixed Effects in a Difference-in-Differences Estimation using the Least Squares Dummy Variable Approach in R

I am trying to do a Difference-In-Differences Regression with Fixed Effects. The regression is meant to estimate the impact of participating in a televised Sports Event on the Social Media Follower Count of the participating Teams, compared to other Teams that did not participate.

My Data looks like this: [Data][1]

The dependent variable is Rate_Percent, the growth rate of Facebook likes, which is calculated as follows:

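library(dplyr)
# Assumes rows are already sorted by time within each ID
Dataset_FB <- Dataset_FB %>% group_by(ID) %>%
     mutate(Diff_Growth = FBLikes - lag(FBLikes),            # absolute change in likes
     Rate_Percent = Diff_Growth / lag(FBLikes) * 100) %>%    # growth rate in percent
     ungroup()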
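
To illustrate what this calculation produces, here is a minimal sketch on made-up numbers (the `toy` data frame and its values are assumptions, not the real data). The first row of each club has no previous value, so its Rate_Percent is NA, which is consistent with the "observations deleted due to missingness" note in the summary further down.

```r
library(dplyr)

# Hypothetical data: two clubs, three observations each
toy <- data.frame(
  ID      = c(1, 1, 1, 2, 2, 2),
  FBLikes = c(100, 110, 121, 200, 210, 231)
)

toy <- toy %>%
  group_by(ID) %>%
  mutate(Diff_Growth  = FBLikes - lag(FBLikes),            # change since previous row
         Rate_Percent = Diff_Growth / lag(FBLikes) * 100) %>%
  ungroup()

# The first row per club is NA because lag() has nothing to look back at
print(toy)
```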

Teilnahme is a dummy variable that distinguishes participants from non-participants, and Hauptrunde is a dummy variable that indicates the time frame of the treatment (0 before the treatment, 1 after the treatment). I am trying to include ID, Uhrzeit and Spieltag as fixed effects to control for club and time differences.

My regression looks like this:

reg <- lm (Rate_Percent ~ Teilnahme + Hauptrunde + Teilnahme*Hauptrunde + factor(ID) + factor(Uhrzeit) + factor(Spieltag), data=Dataset_FB)

Now, my questions are as follows:

  1. The summary looks far from correct, but I can't find my mistake. What did I do wrong?
  2. Is this the correct way to use fixed effects?
  3. I know "Coefficients: (6 not defined because of singularities)" indicates a strong correlation between my independent variables. But since they are dummies, are they not always correlated?

The summary looks like this:

lm(formula = Rate_Percent ~ Teilnahme + Hauptrunde + Teilnahme * 
 Hauptrunde + factor(ID) + factor(Uhrzeit) + factor(Spieltag), 
 data = Dataset_FB)

Residuals:
 Min      1Q  Median      3Q     Max 
-0.2834 -0.0343 -0.0111  0.0092  4.9302 

Coefficients: (6 not defined because of singularities)
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)           0.0266970  0.0125098   2.134  0.03288 *  
Teilnahme             0.0020571  0.1662742   0.012  0.99013    
Hauptrunde           -0.0158433  0.0060631  -2.613  0.00900 ** 
factor(ID)8          -0.0344717  0.0171467  -2.010  0.04443 *  
factor(ID)25         -0.0155100  0.1662745  -0.093  0.92568    
factor(ID)56          0.0122209  0.0171467   0.713  0.47604    
factor(ID)69         -0.0093248  0.1662745  -0.056  0.95528    
factor(ID)90         -0.0037743  0.0171467  -0.220  0.82578    
factor(ID)93          0.0948638  0.0171467   5.532 3.29e-08 ***
factor(ID)103         0.0117689  0.0171467   0.686  0.49251    
factor(ID)115         0.0479442  0.0171467   2.796  0.00519 ** 
factor(ID)166        -0.0129542  0.0171467  -0.755  0.44998    
factor(ID)364        -0.0112018  0.0171467  -0.653  0.51359    
factor(ID)373        -0.0111296  0.0171467  -0.649  0.51631    
factor(ID)490        -0.0231408  0.0171467  -1.350  0.17720    
factor(ID)752        -0.0064241  0.0171467  -0.375  0.70793    
factor(ID)907         0.1333400  0.0171467   7.776 8.75e-15 ***
factor(ID)951         0.0087327  0.0171467   0.509  0.61057    
factor(ID)996        -0.0105943  0.0171467  -0.618  0.53669    
factor(ID)1238        0.0076285  0.0171467   0.445  0.65641    
factor(ID)1315        0.0304732  0.1662745   0.183  0.85459    
factor(ID)1316        0.1290605  0.0171467   7.527 5.98e-14 ***
factor(ID)1400        0.0038137  0.0171467   0.222  0.82400    
factor(ID)1401       -0.0135700  0.0171467  -0.791  0.42874    
factor(ID)1712       -0.0001285  0.0171467  -0.007  0.99402    
factor(ID)3417        0.0053766  0.0171467   0.314  0.75386    
factor(ID)5646        0.0052521  0.0171467   0.306  0.75939    
factor(ID)6273       -0.0134096  0.0171467  -0.782  0.43422    
factor(ID)7679       -0.0104365  0.0171467  -0.609  0.54277    
factor(ID)9029               NA         NA      NA       NA    
factor(ID)10213      -0.0441121  0.0171467  -2.573  0.01012 *  
factor(ID)26957      -0.0287541  0.0171700  -1.675  0.09405 .  
factor(ID)29988       0.1015109  0.1662745   0.611  0.54155    
factor(ID)40373       0.0203831  0.0171467   1.189  0.23459    
factor(Uhrzeit)1530   0.0206731  0.1653880   0.125  0.90053    
factor(Uhrzeit)1830          NA         NA      NA       NA    
factor(Uhrzeit)2045          NA         NA      NA       NA    
factor(Spieltag)NA           NA         NA      NA       NA    
factor(Spieltag)Sa           NA         NA      NA       NA    
factor(Spieltag)So           NA         NA      NA       NA    
Teilnahme:Hauptrunde  0.0053874  0.0085752   0.628  0.52987    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1649 on 5885 degrees of freedom
(32 observations deleted due to missingness)
Multiple R-squared:  0.07278,   Adjusted R-squared:  0.06742 
F-statistic: 13.59 on 34 and 5885 DF,  p-value: < 2.2e-16





[1]: https://i.stack.imgur.com/ZBAqL.png
  1. The output is correct and you did nothing wrong per se, but there are more elegant ways to run the fixed effects regression.

  2. Yes, although the fixed-effect coefficients themselves will not be consistently estimated in the model. The coefficients on your regressors of interest are not affected by this.

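  3. Here, the singularities mean that some columns of the design matrix are exact linear combinations of others, so the model cannot estimate a separate coefficient for them. This can happen when a level of ID, Uhrzeit or Spieltag has only one unique observation, or when a dummy such as Teilnahme does not vary within a club and is therefore collinear with the ID dummies.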
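
One way such singularities can arise is sketched below on simulated data (the `toy` panel, its size and its values are assumptions for illustration only). A participation dummy that is constant within each club is the sum of those clubs' ID dummies, so lm() cannot identify every coefficient and reports one as NA.

```r
# Toy panel: 4 clubs observed over 6 periods (all values are made up)
set.seed(1)
toy <- expand.grid(ID = 1:4, t = 1:6)
toy$Teilnahme  <- as.integer(toy$ID %in% c(3, 4))  # constant within each club
toy$Hauptrunde <- as.integer(toy$t > 3)            # 0 before, 1 after treatment
toy$y <- rnorm(nrow(toy))

fit <- lm(y ~ Teilnahme * Hauptrunde + factor(ID), data = toy)

# Teilnahme equals the sum of the dummies for clubs 3 and 4, so the
# design matrix is rank deficient and one coefficient comes back as NA
na_terms <- names(coef(fit))[is.na(coef(fit))]
print(na_terms)
```

The interaction Teilnahme:Hauptrunde, which is the difference-in-differences estimate, is still identified here because it varies over time within the participating clubs.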

I would suggest having a look into two packages:

  1. plm, which is the standard for panel data models. I am not 100% sure if your data is a real panel (and whether you are actually estimating a diff-in-diff specification).

You would have something like:

library(plm)
# Declare the panel structure once: ID is the unit index, Uhrzeit the time index
pdata <- pdata.frame(Dataset_FB, index = c("ID", "Uhrzeit"))
plm(Rate_Percent ~ Teilnahme * Hauptrunde + factor(Spieltag),
    data = pdata, model = "within", effect = "twoways")

  2. felm (from the lfe package) is a great and easy-to-use alternative, where you specify the factor variables after a | in the formula.

library(lfe)
est <- felm(Rate_Percent ~ Teilnahme * Hauptrunde | ID + Uhrzeit + Spieltag,
            data = Dataset_FB)

As an explanation: these fixed-effects packages first transform your data with a within-transformation, which subtracts the group averages from each variable. Thanks to this, we don't need to estimate the coefficients for the fixed effects explicitly, as your code does. These other solutions are therefore slightly neater and produce easier-to-read output, but numerically there shouldn't be any difference in the estimated coefficients.
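
The equivalence described above can be checked with a minimal base-R sketch on simulated data (the `toy` panel, the regressor `x` and the true slope of 0.3 are all assumptions for illustration). It compares the dummy-variable regression with a plain regression on demeaned variables.

```r
# Simulated panel: 5 clubs, 8 periods, one regressor x (all values made up)
set.seed(42)
toy <- expand.grid(ID = 1:5, t = 1:8)
toy$x <- rnorm(nrow(toy))
club_effect <- c(-1, 0, 0.5, 1, 2)[toy$ID]   # club-specific intercepts
toy$y <- 0.3 * toy$x + club_effect + rnorm(nrow(toy), sd = 0.1)

# LSDV: one dummy per club, as in the lm() call from the question
b_lsdv <- coef(lm(y ~ x + factor(ID), data = toy))["x"]

# Within transformation: subtract each club's mean, then run plain OLS
toy$y_dm <- toy$y - ave(toy$y, toy$ID)
toy$x_dm <- toy$x - ave(toy$x, toy$ID)
b_within <- coef(lm(y_dm ~ x_dm - 1, data = toy))["x_dm"]

# Both approaches give the same slope estimate
print(all.equal(unname(b_lsdv), unname(b_within)))
```

Only the point estimates are compared here. The standard errors from the hand-demeaned regression do not account for the degrees of freedom used up by the club means, which is one of the details that plm and felm handle for you.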
