plm regression fixed effects on two variables
I have the following simplified df:
problem <- data.frame(
stringsAsFactors = FALSE,
fkeycompany = c("0000001961",
"0000003570","0000003570","0000003570",
"0000003570","0000003570","0000003570",
"0000003570","0000004187","0000004187","0000004187",
"0000004187","0000016058","0000022872",
"0000022872","0000022872","0000022872","0000024071",
"0000050471","0000052971","0000052971",
"0000056679","0000058592","0000058592","0000058592",
"0000063330","0000099047","0000099047",
"0000099047","0000316206","0000316537",
"0000319697","0000351917","0000351917","0000351917",
"0000356037","0000356037","0000356037",
"0000700815","0000700815","0000700815","0000700815",
"0000704415","0000704415","0000704415",
"0000705003","0000720154","0000720154","0000720154",
"0000720154"),
fiscalyear = c(2018,2002,
2002,2004,2006,2007,2007,2014,2005,2005,
2009,2017,2003,2002,2004,2004,2010,2002,
2016,2008,2008,2002,2005,2005,2010,2014,
2000,2005,2005,2002,2002,2001,2005,2005,
2006,2007,2012,2015,2006,2006,2007,2008,
2003,2014,2014,2000,2004,2006,2008,2013),
zmijewskiscore = c(-0.295998372490631,-3.0604522838509,-3.0604522838509,
-9.70437199970406,-0.836774487816746,
0.500903351523752,0.500903351523752,-1.29210741224579,
-1.96529713996165,-1.96529713996165,
-1.60831783946871,-2.12343231229296,-3.99767761748961,
0.561261861396196,4.13793269655047,4.13793269655047,
5.61803398400963,-0.000195582736436772,
-3.93766039340527,-0.540037039625719,
-0.540037039625719,-1.93767533120689,-4.54446419505987,
-4.54446419505987,1.94389244672183,
0.941272649148121,-3.88427264672157,-0.342812414189714,
-0.342812414189714,-1.35074505582686,
-4.52746658422071,-0.130671284507204,-0.223517713694019,
-0.223517713694019,0.0149617517859735,
-2.95100357094774,-2.55146691134187,-1.86846592111008,
2.92283100206773,2.92283100206773,
4.65325023636937,6.1585365469118,-4.54449586848866,
-1.49969162335521,-1.49969162335521,-3.34071706450412,
-1.72382101559976,-1.53076052307727,
-1.77582320023177,-1.57280701642882),
lloss = c(0,1,1,1,1,
1,1,1,0,0,0,1,0,0,1,1,1,1,0,1,1,
1,0,0,1,0,0,1,1,0,0,1,1,1,1,0,0,
1,1,1,1,1,0,1,1,0,1,1,1,0),
GCO_prev = c(1,1,1,0,0,
0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,
0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,1,0,0,0,0,0,0,0,0),
GCO = c(0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,
0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,
0,0,0,1,1,0,0,0,0,0,0,0,0),
industry = c(9,5,5,5,5,
5,5,5,6,6,6,6,9,9,9,9,9,6,9,6,6,
9,8,8,8,8,9,9,9,9,8,9,5,5,5,9,9,
9,6,6,6,6,9,9,9,9,9,9,9,9))
I want to run a plm regression on this data with fixed effects for year and industry.
library(plm)
summary(plm(GCO ~ GCO_prev + lloss + zmijewskiscore, index=c("fiscalyear", "industry"), data=problem, model="within" ))
However, I get this error when I run it:
Error in pdim.default(index[[1L]], index[[2L]]) :
duplicate couples (id-time)
In addition: Warning message:
In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")
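I'm not sure how to fix this. If I understand correctly, it happens because there is more than one company (fkeycompany code) for a given combination such as industry = 9, fiscalyear = 2003. For some industries, say 9, several rows (fkeycompanies, in this case 0000016058 and 0000704415) contain the year 2003. At least, that is what I think the problem is; or am I wrong? I believe the same thing occurs for more industries and years in my main dataset. How can I fix this error?
Also, apart from this problem, is the code I am running correct? Am I really regressing with year and industry effects?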
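Given your data, the observational unit of the panel is the company (fkeycompany). You may want to add industry as an additional fixed effect, but it is certainly not the time index. The index argument takes the individual index in the first position and the time index in the second, which I assume is fiscalyear. Many existing questions already cover this topic. Also, be sure to read the package documentation first; it explains how to specify the data through the index argument.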
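I suggest converting to a pdata.frame first.
However, some combinations of fkeycompany and fiscalyear occur more than once. See the code below: cells with a value > 1 in the table output point you to the duplicated combinations.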
library(plm)
pdat.problem <- pdata.frame(problem, index = c("fkeycompany", "fiscalyear"))
#> Warning in pdata.frame(problem, index = c("fkeycompany", "fiscalyear")): duplicate couples (id-time) in resulting pdata.frame
#> to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")
table(index(pdat.problem), useNA = "ifany")
#> fiscalyear
#> fkeycompany 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2012 2013
#> 0000001961 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000003570 0 0 2 0 1 0 1 2 0 0 0 0 0
#> 0000004187 0 0 0 0 0 2 0 0 0 1 0 0 0
#> 0000016058 0 0 0 1 0 0 0 0 0 0 0 0 0
#> 0000022872 0 0 1 0 2 0 0 0 0 0 1 0 0
#> 0000024071 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000050471 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000052971 0 0 0 0 0 0 0 0 2 0 0 0 0
#> 0000056679 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000058592 0 0 0 0 0 2 0 0 0 0 1 0 0
#> 0000063330 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000099047 1 0 0 0 0 2 0 0 0 0 0 0 0
#> 0000316206 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000316537 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000319697 0 1 0 0 0 0 0 0 0 0 0 0 0
#> 0000351917 0 0 0 0 0 2 1 0 0 0 0 0 0
#> 0000356037 0 0 0 0 0 0 0 1 0 0 0 1 0
#> 0000700815 0 0 0 0 0 0 2 1 1 0 0 0 0
#> 0000704415 0 0 0 1 0 0 0 0 0 0 0 0 0
#> 0000705003 1 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000720154 0 0 0 0 1 0 1 0 1 0 0 0 1
#> fiscalyear
#> fkeycompany 2014 2015 2016 2017 2018
#> 0000001961 0 0 0 0 1
#> 0000003570 1 0 0 0 0
#> 0000004187 0 0 0 1 0
#> 0000016058 0 0 0 0 0
#> 0000022872 0 0 0 0 0
#> 0000024071 0 0 0 0 0
#> 0000050471 0 0 1 0 0
#> 0000052971 0 0 0 0 0
#> 0000056679 0 0 0 0 0
#> 0000058592 0 0 0 0 0
#> 0000063330 1 0 0 0 0
#> 0000099047 0 0 0 0 0
#> 0000316206 0 0 0 0 0
#> 0000316537 0 0 0 0 0
#> 0000319697 0 0 0 0 0
#> 0000351917 0 0 0 0 0
#> 0000356037 0 1 0 0 0
#> 0000700815 0 0 0 0 0
#> 0000704415 2 0 0 0 0
#> 0000705003 0 0 0 0 0
#> 0000720154 0 0 0 0 0
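How to resolve the duplicates depends on why they exist, which only you can judge from your data. As a minimal sketch in base R, assuming a small made-up data frame named toy that has the same two key columns as problem (the values are hypothetical), you could list the offending rows and then either drop exact duplicate rows or keep one row per company-year:

```r
# Toy data with the same key columns as `problem`; the values are made up.
toy <- data.frame(
  fkeycompany = c("A", "A", "A", "B", "B"),
  fiscalyear  = c(2002, 2002, 2004, 2005, 2005),
  GCO         = c(0, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

key <- toy[, c("fkeycompany", "fiscalyear")]

# Flag every row whose (company, year) pair occurs more than once
is_dup   <- duplicated(key) | duplicated(key, fromLast = TRUE)
dup_rows <- toy[is_dup, ]

# Option 1: drop only rows that are identical in all columns.
# Pairs with conflicting values (company "B" here) remain duplicated.
toy_exact <- unique(toy)

# Option 2: keep the first row per (company, year) pair.
# This silently discards conflicting rows, so inspect dup_rows first.
toy_first <- toy[!duplicated(key), ]
```

If the repeated rows carry different values, aggregating them (for example with aggregate()) may be more appropriate than dropping rows; that is a judgment call about the data, not something plm decides for you.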
Once that is fixed, you will be able to run models along these lines. For a time fixed effects model:
model <- plm(GCO ~ GCO_prev + lloss + zmijewskiscore, data = pdat.problem, model="within", effect = "time")
Or a time fixed effects model with industry as an additional fixed effect:
model2 <- plm(GCO ~ GCO_prev + lloss + zmijewskiscore + factor(industry), data = pdat.problem, model="within", effect = "time")
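To see what the time fixed effect does, it can help to reproduce it with lm() in base R. The sketch below uses simulated toy data (the names toy, x and y are hypothetical and not part of your dataset). It shows that demeaning by year gives the same slope as including year dummies, and that industry dummies can be added alongside them in the same way:

```r
set.seed(1)

# Simulated toy data: 4 years, 3 industries, one regressor
toy <- data.frame(
  year     = rep(2001:2004, each = 6),
  industry = rep(c(5, 6, 9), times = 8),
  x        = rnorm(24)
)
toy$y <- 0.5 * toy$x + rnorm(24)

# Time fixed effects via dummy variables (LSDV)
time_lsdv <- lm(y ~ x + factor(year), data = toy)

# The same slope via the "within" transformation: subtract year means
toy$y_dm <- toy$y - ave(toy$y, toy$year)
toy$x_dm <- toy$x - ave(toy$x, toy$year)
time_within <- lm(y_dm ~ x_dm - 1, data = toy)

# Year and industry effects together, both as dummies
both_lsdv <- lm(y ~ x + factor(year) + factor(industry), data = toy)

coef(time_lsdv)["x"]
coef(time_within)["x_dm"]
coef(both_lsdv)["x"]
```

The first two slopes agree up to numerical precision. Treat this only as an illustration of the mechanics; the reported standard errors of the demeaned lm() fit are not adjusted for the absorbed year means, so for inference the plm or dummy-variable fit is the safer choice.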