plm regression fixed effects on two variables
I have the following simplified df:
problem <- data.frame(
stringsAsFactors = FALSE,
fkeycompany = c("0000001961",
"0000003570","0000003570","0000003570",
"0000003570","0000003570","0000003570",
"0000003570","0000004187","0000004187","0000004187",
"0000004187","0000016058","0000022872",
"0000022872","0000022872","0000022872","0000024071",
"0000050471","0000052971","0000052971",
"0000056679","0000058592","0000058592","0000058592",
"0000063330","0000099047","0000099047",
"0000099047","0000316206","0000316537",
"0000319697","0000351917","0000351917","0000351917",
"0000356037","0000356037","0000356037",
"0000700815","0000700815","0000700815","0000700815",
"0000704415","0000704415","0000704415",
"0000705003","0000720154","0000720154","0000720154",
"0000720154"),
fiscalyear = c(2018,2002,
2002,2004,2006,2007,2007,2014,2005,2005,
2009,2017,2003,2002,2004,2004,2010,2002,
2016,2008,2008,2002,2005,2005,2010,2014,
2000,2005,2005,2002,2002,2001,2005,2005,
2006,2007,2012,2015,2006,2006,2007,2008,
2003,2014,2014,2000,2004,2006,2008,2013),
zmijewskiscore = c(-0.295998372490631,-3.0604522838509,-3.0604522838509,
-9.70437199970406,-0.836774487816746,
0.500903351523752,0.500903351523752,-1.29210741224579,
-1.96529713996165,-1.96529713996165,
-1.60831783946871,-2.12343231229296,-3.99767761748961,
0.561261861396196,4.13793269655047,4.13793269655047,
5.61803398400963,-0.000195582736436772,
-3.93766039340527,-0.540037039625719,
-0.540037039625719,-1.93767533120689,-4.54446419505987,
-4.54446419505987,1.94389244672183,
0.941272649148121,-3.88427264672157,-0.342812414189714,
-0.342812414189714,-1.35074505582686,
-4.52746658422071,-0.130671284507204,-0.223517713694019,
-0.223517713694019,0.0149617517859735,
-2.95100357094774,-2.55146691134187,-1.86846592111008,
2.92283100206773,2.92283100206773,
4.65325023636937,6.1585365469118,-4.54449586848866,
-1.49969162335521,-1.49969162335521,-3.34071706450412,
-1.72382101559976,-1.53076052307727,
-1.77582320023177,-1.57280701642882),
lloss = c(0,1,1,1,1,
1,1,1,0,0,0,1,0,0,1,1,1,1,0,1,1,
1,0,0,1,0,0,1,1,0,0,1,1,1,1,0,0,
1,1,1,1,1,0,1,1,0,1,1,1,0),
GCO_prev = c(1,1,1,0,0,
0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,
0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,1,0,0,0,0,0,0,0,0),
GCO = c(0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,
0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,
0,0,0,1,1,0,0,0,0,0,0,0,0),
industry = c(9,5,5,5,5,
5,5,5,6,6,6,6,9,9,9,9,9,6,9,6,6,
9,8,8,8,8,9,9,9,9,8,9,5,5,5,9,9,
9,6,6,6,6,9,9,9,9,9,9,9,9))
I want to run a plm regression on this with fixed effects for year and industry.
library(plm)
summary(plm(GCO ~ GCO_prev + lloss + zmijewskiscore, index=c("fiscalyear", "industry"), data=problem, model="within" ))
However, I get this error when I run it:
Error in pdim.default(index[[1L]], index[[2L]]) :
duplicate couples (id-time)
In addition: Warning message:
In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")
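I am not sure how to solve this. If I understand correctly, it happens because more than one company (fkeycompany code) can fall into the same combination of, e.g., industry = 9 and fiscalyear = 2003. So for some industries, say 9, several rows (fkeycompanies, in this case 0000016058 and 0000704415) contain the year 2003 (at least that is what I think the problem is, or am I wrong?). I believe this affects more industries and years in my main dataset. How can I fix this error message?
Apart from this issue, is the code I am running correct? Am I really regressing with year and industry effects?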
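Given your data, the observational unit of the panel is the company (fkeycompany). You may want to add industry as another fixed effect, but it is certainly not the time index (the time index goes in the second position of the index argument, and I assume it is fiscalyear). There are plenty of existing questions that cover this topic. Also, be sure to read the package documentation first, which explains how to specify the data for the index argument.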
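To illustrate why the choice of index matters, here is a minimal base R sketch. The data frame `toy`, its columns, and the helper `is_valid_index` are hypothetical names made up for this illustration; they are not part of plm. The idea is only that a pair of columns can serve as an (id, time) index when every combination of the two occurs at most once.

```r
# Hypothetical toy data: two firms in the same industry, observed in two years.
toy <- data.frame(
  firm     = c("A", "B", "A", "B"),
  year     = c(2003, 2003, 2004, 2004),
  industry = c(9, 9, 9, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if the given columns identify each row uniquely.
# anyDuplicated() returns 0 when no row is repeated.
is_valid_index <- function(df, cols) anyDuplicated(df[, cols]) == 0

# (year, industry) repeats as soon as two firms share an industry in a year.
year_industry_ok <- is_valid_index(toy, c("year", "industry"))

# (firm, year) is unique here, so it can act as the (id, time) index.
firm_year_ok <- is_valid_index(toy, c("firm", "year"))

print(c(year_industry = year_industry_ok, firm_year = firm_year_ok))
```

The same check can be run on the real data with the actual column names before calling plm.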
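I suggest converting to a pdata.frame first.
However, your data contain duplicate combinations of fkeycompany and fiscalyear. See the code below: values > 1 in the output of table point you to the offending combinations.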
library(plm)
pdat.problem <- pdata.frame(problem, index = c("fkeycompany", "fiscalyear"))
#> Warning in pdata.frame(problem, index = c("fkeycompany", "fiscalyear")): duplicate couples (id-time) in resulting pdata.frame
#> to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")
table(index(pdat.problem), useNA = "ifany")
#> fiscalyear
#> fkeycompany 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2012 2013
#> 0000001961 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000003570 0 0 2 0 1 0 1 2 0 0 0 0 0
#> 0000004187 0 0 0 0 0 2 0 0 0 1 0 0 0
#> 0000016058 0 0 0 1 0 0 0 0 0 0 0 0 0
#> 0000022872 0 0 1 0 2 0 0 0 0 0 1 0 0
#> 0000024071 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000050471 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000052971 0 0 0 0 0 0 0 0 2 0 0 0 0
#> 0000056679 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000058592 0 0 0 0 0 2 0 0 0 0 1 0 0
#> 0000063330 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000099047 1 0 0 0 0 2 0 0 0 0 0 0 0
#> 0000316206 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000316537 0 0 1 0 0 0 0 0 0 0 0 0 0
#> 0000319697 0 1 0 0 0 0 0 0 0 0 0 0 0
#> 0000351917 0 0 0 0 0 2 1 0 0 0 0 0 0
#> 0000356037 0 0 0 0 0 0 0 1 0 0 0 1 0
#> 0000700815 0 0 0 0 0 0 2 1 1 0 0 0 0
#> 0000704415 0 0 0 1 0 0 0 0 0 0 0 0 0
#> 0000705003 1 0 0 0 0 0 0 0 0 0 0 0 0
#> 0000720154 0 0 0 0 1 0 1 0 1 0 0 0 1
#> fiscalyear
#> fkeycompany 2014 2015 2016 2017 2018
#> 0000001961 0 0 0 0 1
#> 0000003570 1 0 0 0 0
#> 0000004187 0 0 0 1 0
#> 0000016058 0 0 0 0 0
#> 0000022872 0 0 0 0 0
#> 0000024071 0 0 0 0 0
#> 0000050471 0 0 1 0 0
#> 0000052971 0 0 0 0 0
#> 0000056679 0 0 0 0 0
#> 0000058592 0 0 0 0 0
#> 0000063330 1 0 0 0 0
#> 0000099047 0 0 0 0 0
#> 0000316206 0 0 0 0 0
#> 0000316537 0 0 0 0 0
#> 0000319697 0 0 0 0 0
#> 0000351917 0 0 0 0 0
#> 0000356037 0 1 0 0 0
#> 0000700815 0 0 0 0 0
#> 0000704415 2 0 0 0 0
#> 0000705003 0 0 0 0 0
#> 0000720154 0 0 0 0 0
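With many firms and years the table above gets hard to scan. As a sketch of an alternative, the duplicated pairs can be listed directly in base R and exact copies dropped with `unique()`. The toy data and the variable names below are assumptions made for illustration. In the sample posted in the question the repeated rows appear to be identical in every column, but that is worth verifying on the full dataset.

```r
# Hypothetical toy data: firm A has an exact duplicate for 2002,
# firm B has two rows for 2005 that disagree on `score`.
toy <- data.frame(
  firm  = c("A", "A", "A", "B", "B"),
  year  = c(2002, 2002, 2004, 2005, 2005),
  score = c(-3.06, -3.06, -9.70, -1.97, -1.50),
  stringsAsFactors = FALSE
)
key <- c("firm", "year")

# Flag every row whose (firm, year) pair occurs more than once.
dup_rows <- duplicated(toy[, key]) | duplicated(toy[, key], fromLast = TRUE)
print(toy[dup_rows, ])

# Drop rows that are identical in all columns.
deduped <- unique(toy)

# Whatever is still duplicated on the key has conflicting values
# and needs a manual decision.
still_dup <- duplicated(deduped[, key]) |
  duplicated(deduped[, key], fromLast = TRUE)
conflicts <- deduped[still_dup, ]
print(conflicts)
```

How to resolve rows that share a key but differ in value (keep one, aggregate, or correct the source data) depends on how the data were built, so the sketch only isolates them.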
Once that is fixed, you will be able to run models along these lines. For a time fixed effects model:
model <- plm(GCO ~ GCO_prev + lloss + zmijewskiscore, data = pdat.problem, model="within", effect = "time")
Or a time fixed effects model with industry as an additional fixed effect:
model2 <- plm(GCO ~ GCO_prev + lloss + zmijewskiscore + factor(industry), data = pdat.problem, model="within", effect = "time")
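For a version that runs end to end without the duplicate problem, here is a small sketch on simulated data. Everything in it (the data frame `toy`, the variables `x` and `y`, the firm labels) is invented for illustration and only mirrors the structure of the two models above; it says nothing about the results on the real data.

```r
library(plm)

set.seed(1)  # reproducible simulated data

# Hypothetical balanced panel: 6 firms, 5 years, one row per firm-year.
firms <- paste0("F", 1:6)
toy <- expand.grid(firm = firms, year = 2001:2005, stringsAsFactors = FALSE)

# Each firm belongs to one industry (constant over time).
toy$industry <- rep(c(5, 6, 9), each = 2)[match(toy$firm, firms)]
toy$x <- rnorm(nrow(toy))
toy$y <- 0.5 * toy$x + rnorm(nrow(toy))

# id first, time second.
ptoy <- pdata.frame(toy, index = c("firm", "year"))

# Time fixed effects only.
m_time <- plm(y ~ x, data = ptoy, model = "within", effect = "time")

# Time fixed effects plus industry dummies.
m_time_ind <- plm(y ~ x + factor(industry), data = ptoy,
                  model = "within", effect = "time")

print(coef(m_time))
print(coef(m_time_ind))
```

Industry enters as ordinary dummy variables here, so its coefficients show up in the output, while the year effects are removed by the within transformation.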