R studio: Panel regression and clustering by group and time?

I have some panel data and am using the plm package. I would like to cluster standard errors by both group and time. However, I have only been able to cluster at one level, not both.

Data example:

Date   Country  Stock_Returns  House_Prices
1990   Japan    11.84          1000.00
1991   Japan    5.65           759.6
1990   USA      -6.45          2831.90
1991   USA      9.78           532.63

An example of my regression

library(plm)
Reg1 <- plm(Stock_Returns ~ House_Prices, data = DF1, index = c("Country", "Date"), model = "within")

Here is my current approach for clustering by time, but I cannot figure out how to cluster by both "time" and "group":

library(lmtest)  # provides coeftest()
x <- coeftest(Reg1, vcov. = function(x) vcovHC(x, type = "sss", cluster = "time"))

Any help is appreciated

You could use lfe::felm. Its formula has four parts, y ~ x1 + x2 | f1 + f2 | (Q|W ~ x3+x4) | clu1 + clu2, where f1 and f2 are the fixed effects, the third part is an instrumental-variable specification (0 if there is none), and clu1 and clu2 are the cluster variables.

library(lfe)
fit <- felm(Stock_Returns ~ House_Prices | Country + Date | 0 | Date + Country, data=DF1)
summary(fit)
# Call:
#   felm(formula = Stock_Returns ~ House_Prices | Country + Date |      0 | Date + Country, data = DF1) 
# 
# Residuals:
#   Min       1Q   Median       3Q      Max 
# -1.23404 -0.47836  0.03347  0.37293  1.94425 
# 
# Coefficients:
#   Estimate Cluster s.e. t value Pr(>|t|)   
# House_Prices  0.55393      0.08945   6.193  0.00848 **
#   ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.7618 on 29 degrees of freedom
# Multiple R-squared(full model): 0.7174   Adjusted R-squared: 0.581 
# Multiple R-squared(proj model): 0.5377   Adjusted R-squared: 0.3145 
# F-statistic(full model, *iid*):5.258 on 14 and 29 DF, p-value: 8.098e-05 
# F-statistic(proj model): 38.35 on 1 and 3 DF, p-value: 0.008482 

Note: If you end up with few clusters (fewer than about 50), cluster-robust standard errors can be unreliable, and you may need to bootstrap them with special methods. You may want to read Cameron et al. 2015 and consult a local statistician.


Data

set.seed(42)
DF1 <- expand.grid(Date=1990:2000, Country=c("Japan", "USA", "Germany", "Mexico"))
DF1 <- within(DF1, {
  Stock_Returns <- rnorm(nrow(DF1), 20)
  House_Prices <- abs(rnorm(nrow(DF1), 5000)) + Stock_Returns
})
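
If you would rather stay within plm, as the question asks, here is a minimal sketch using plm's vcovDC, which computes a double-clustered covariance matrix (group-clustered plus time-clustered minus the heteroskedasticity-robust term). It rebuilds the simulated data so that it runs on its own. The object names V_both and ct_both are just illustrative, and the model is the question's one-way within model, so the numbers are not expected to match the felm output above, which also includes Date fixed effects.

```r
library(plm)
library(lmtest)

# Same simulated data as above, rebuilt so this block is self-contained
set.seed(42)
DF1 <- expand.grid(Date = 1990:2000,
                   Country = c("Japan", "USA", "Germany", "Mexico"))
DF1$Stock_Returns <- rnorm(nrow(DF1), 20)
DF1$House_Prices  <- abs(rnorm(nrow(DF1), 5000)) + DF1$Stock_Returns

# The question's model: fixed effects for Country only.
# (Assumption: add effect = "twoways" if you also want Date fixed effects,
#  as in the felm call.)
Reg1 <- plm(Stock_Returns ~ House_Prices, data = DF1,
            index = c("Country", "Date"), model = "within")

# Double-clustered (group and time) covariance matrix
V_both <- vcovDC(Reg1, type = "sss")

# Coefficient table using the double-clustered standard errors
ct_both <- coeftest(Reg1, vcov. = function(x) vcovDC(x, type = "sss"))
print(ct_both)
```

Because the double-clustered matrix is built by adding and subtracting three estimates, it is not guaranteed to be positive definite in small samples, so it is worth checking that the resulting standard errors are sensible. The caveat about few clusters applies here as well.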
