[英]How to turn year fixed-effects into decade fixed-effects with plm()?
For my bachelor thesis I have a regression with fixed-effects and time-fixed effects in years:对于我的学士论文,我有一个多年固定效应和时间固定效应的回归:
log(production_it )= β_0 + β_1 * log(temp_it ) + β_2 * log(rain_it ) + β_3 * drought_it + β_4 * flood_it + β_5 * storm_it + β_6 * log(labour_it )+ β_7* log(Fertilitzer_it )+ β_8* log(capital_it )+ β_9* log(area_it )+ η_t+ u_i+ ε_it
log(production_it)= β_0 + β_1 * log(temp_it) + β_2 * log(rain_it) + β_3 *干旱_it + β_4 * flood_it + β_5 *storm_it + β_6 * log(labour_it)+ β_7* log(Fertilitzer_it)+ β_8* log (capital_it )+ β_9* log(area_it )+ η_t+ u_i+ ε_it
where在哪里
i: country, t: year
i: 国家, t: 年
r1.time.fixed <- plm(log(production) ~ log(temp) + log(rain) + drought + flood +
storm + log(labour) + log(fertilizer) + log(capital) +
log(area), data=pm.rich, model="within", effect="twoways")
Now I want to create the following regression with decade instead of year as fixed-effect:现在我想用十年而不是一年作为固定效应创建以下回归:
log(production_id )= β_0 + β_1 * log(temp_id ) + β_2 * log(rain_id ) + β_3 * drought_id + β_4 * flood_id + β_5 * storm_id + β_6 * log(labour_id )+ β_7* log(Fertilitzer_id )+ β_8* log(capital_id )+ β_9* log(area_id )+ η_d+ u_i+ ε_id
日志(生产标识)= β_0 + β_1 * 日志(临时标识)+ β_2 * 日志(雨标识)+ β_3 * 干旱标识 + β_4 * 洪水标识 + β_5 * 风暴标识 + β_6 * 日志(劳动力标识)+ β_7* 日志(肥料标识)+ β_8* 日志(capital_id )+ β_9* log(area_id )+ η_d+ u_i+ ε_id
where在哪里
i:country, d: decade
i:国家,d:十年
How can I create the decade fixed-effects in r, given a panel data set based on year data?给定基于年份数据的面板数据集,如何在 r 中创建十年固定效应?
Here you can find my data I use:在这里你可以找到我使用的数据:
First, you want the averages for each country in each decade as single observations, where you probably want the mean
values for each country in each decade.首先,您希望每个国家/地区在每个十年中的平均值作为单个观察值,您可能希望每个国家/地区在每个十年中的
mean
。 We do this outside plm
using aggregate
, where we paste0
together the first three substr
ings of the years with a zero, eg 193
5
→ 1930. Let me show you with the Grunfeld
data that comes with plm
:我们在
plm
外部使用aggregate
执行此操作,其中我们将年份的前三paste0
substr
与零一起粘贴 0,例如 193
5
→ 1930。让我向您展示plm
附带的Grunfeld
数据:
library(plm)
data(Grunfeld)
Grunfeld <- transform(Grunfeld, decade=paste0(substr(year, 1, 3), "0"))
head(Grunfeld, 3)
# firm year inv value capital decade
# 1 1 1935 317.6 3078.5 2.8 1930
# 2 1 1936 391.8 4661.7 52.6 1930
# 3 1 1937 410.6 5387.1 156.9 1930
dim(Grunfeld)
# [1] 200 6
This allows us to aggregate
along the decades:这使我们能够
aggregate
几十年:
Grunfeld.a <- aggregate(. ~ firm + decade, Grunfeld, mean)
head(Grunfeld.a, 3)
# firm decade year inv value capital
# 1 1 1930 1937 341.70 4046.54 124.98
# 2 2 1930 1937 305.56 1921.00 159.06
# 3 3 1930 1937 49.60 2057.12 129.80
dim(Grunfeld.a)
# [1] 30 6
Now we could just put the aggregated Grunfeld.a
into the original plm
call, since plm
internally does some "magic" to recognize the unit and time variables.现在我们可以将聚合的
Grunfeld.a
放入原始plm
调用中,因为plm
在内部做了一些“魔术”来识别单位和时间变量。 However, I consider that as dangerous and recommend to explicitly state the index=
es in the plm
call (see an earlier answer of mine for a comprehensive explanation):但是,我认为这是危险的,并建议在
plm
调用中明确 state index=
es (有关全面解释,请参阅我的早期答案):
## unit and year FE
fit.year <- plm(inv ~ value + capital, data=Grunfeld, index=c("firm", "year"),
model="within", effect="twoways")
## unit and decade FE
fit.decade <- plm(inv ~ value + capital, data=Grunfeld.a, index=c("firm", "decade"),
model="within", effect="twoways")
For the statistical summary you may want to use robust standard errors clustered by unit (ie by country in your case, or firm in this example).对于统计摘要,您可能希望使用按单位聚类的稳健标准误差(即在您的情况下按国家或在本例中按公司)。
plm
comes with a summary.plm
method that allows to customize the vcov=
using vcovHC.plm
. plm
带有一个summary.plm
方法,允许使用vcovHC.plm
自定义vcov=
。 Note that cluster=c("group")
confusingly means that you want to cluster by the variable you defined as unit variable, (ie country in your case).请注意,
cluster=c("group")
令人困惑地意味着您希望按您定义为单位变量的变量(即您的情况下的国家/地区)进行聚类。 For the estimation type=
we might want to use "HC3"
as it is standard in the sandwich
package and often recommended today.对于估计
type=
,我们可能希望使用"HC3"
,因为它是sandwich
package 中的标准,并且在今天经常被推荐。
## unit and year FE
summary(fit.year, vcov=vcovHC(fit.year, type="HC3", cluster=c("group")))$coe
# Estimate Std. Error t-value Pr(>|t|)
# value 0.1177159 0.01212638 9.70742 5.539985e-18
# capital 0.3579163 0.05915972 6.05000 9.049182e-09
## unit and decade FE
summary(fit.decade, vcov=vcovHC(fit.decade, type="HC3", cluster=c("group")))$coe
# Estimate Std. Error t-value Pr(>|t|)
# value 0.1541480 0.05601229 2.752039 1.417452e-02
# capital 0.3476384 0.06517729 5.333735 6.719637e-05
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.