My question is: how do I include industry and year fixed effects in plm when I have multiple firms in the same industry in the same year? A reprex of my data looks like this:
Year Industry CompanyID CEOID CEO.background MBA.CEO CEO.Tenure Female.CEO CEO.age Capex Log.TA Leverage
2005 6 1075 10739 0 0 6.92 0 55 0.08623238 9.199961396 0.330732917
2006 6 1075 10739 0 0 7.92 0 56 0.097455145 9.334559982 0.26575725
2007 6 1075 10739 0 0 8.92 0 56 0.113033772 9.346263914 0.285439531
2008 6 1075 10739 0 0 9.92 0 57 0.108640177 9.327564318 0.322985772
2009 6 1075 5835 0 0 0.67 0 54 0.08526524 9.360491034 0.333880116
2010 6 1075 5835 0 0 1.67 0 55 0.081452292 9.376545673 0.32197511
2005 6 1743 8379 0 0 17.43 0 65 0.236487293 6.693007633 0.021915227
2006 6 1743 26012 0 1 0.91 0 59 0.319264835 6.820455133 0.023157959
2007 6 1743 26012 0 1 1.91 0 58 0.207384938 6.844512984 0.020087012
2008 6 1743 26012 0 1 2.92 0 59 0.130632264 6.890964093 0.017103795
2009 6 1743 26012 0 1 3.92 0 60 0.112029325 6.879662342 0.017283796
2010 6 1743 30801 0 0 1 1 47 0.02804693 6.767971236 0.044755539
2005 7 1004 9249 0 0 9.65 0 53 0.076370794 6.596094672 0.31534354
2006 7 1004 9249 0 0 10.65 0 54 0.114891589 6.886346743 0.327808308
2007 7 1004 9249 0 0 11.65 0 55 0.097727719 6.973199328 0.307086799
2008 7 1004 9249 0 0 12.65 0 56 0.112119583 7.216716829 0.389800369
2009 7 1004 9249 0 0 13.65 0 57 0.086281135 7.228033526 0.331455792
2010 7 1004 9249 0 0 14.65 0 58 0.298922358 7.313914813 0.291147083
CEO.background, MBA.CEO, and Female.CEO are time-invariant dummies for each CEO, and Industry is a time-invariant dummy for each firm, while the rest are time-varying firm/CEO attributes.
I would like to run the following regression with industry and year fixed effects:
plm(Capex ~ CEO.background + MBA.CEO + CEO.Tenure + Female.CEO + CEO.age + Log.TA + Leverage, data=repexcapex, index = (c("Industry", "Year")), model = "within", effect = "twoways")
However, if I have multiple companies in the same industry, as in the data above (CompanyID 1075 and 1743 are both in industry 6), the code gives an error about duplicates:
Error in pdim.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
[...]
If I kill the first 5 rows and run it with just 1 firm per industry, the code works.
How should I formulate my regression so that it includes both industry and year fixed effects? Is running the code with industry dummies, as below, equivalent to industry fixed effects?
plm(Capex ~ CEO.background + MBA.CEO + CEO.Tenure + Female.CEO + CEO.age + Log.TA + Leverage + factor(Industry), data=repexcapex, index = (c("Year")), model = "within", effect = "individual")
This is the formatted data:
repexcapex <- read.table(text="
Year,Industry,CompanyID,CEOID,CEO.background,MBA.CEO,CEO.Tenure,Female.CEO,CEO.age,Capex,Log.TA,Leverage
2005,6,1075,10739,0,0,6.92,0,55,0.08623238,9.199961396,0.330732917
2006,6,1075,10739,0,0,7.92,0,56,0.097455145,9.334559982,0.26575725
2007,6,1075,10739,0,0,8.92,0,56,0.113033772,9.346263914,0.285439531
2008,6,1075,10739,0,0,9.92,0,57,0.108640177,9.327564318,0.322985772
2009,6,1075,5835,0,0,0.67,0,54,0.08526524,9.360491034,0.333880116
2010,6,1075,5835,0,0,1.67,0,55,0.081452292,9.376545673,0.32197511
2005,6,1743,8379,0,0,17.43,0,65,0.236487293,6.693007633,0.021915227
2006,6,1743,26012,0,1,0.91,0,59,0.319264835,6.820455133,0.023157959
2007,6,1743,26012,0,1,1.91,0,58,0.207384938,6.844512984,0.020087012
2008,6,1743,26012,0,1,2.92,0,59,0.130632264,6.890964093,0.017103795
2009,6,1743,26012,0,1,3.92,0,60,0.112029325,6.879662342,0.017283796
2010,6,1743,30801,0,0,1,1,47,0.02804693,6.767971236,0.044755539
2005,7,1004,9249,0,0,9.65,0,53,0.076370794,6.596094672,0.31534354
2006,7,1004,9249,0,0,10.65,0,54,0.114891589,6.886346743,0.327808308
2007,7,1004,9249,0,0,11.65,0,55,0.097727719,6.973199328,0.307086799
2008,7,1004,9249,0,0,12.65,0,56,0.112119583,7.216716829,0.389800369
2009,7,1004,9249,0,0,13.65,0,57,0.086281135,7.228033526,0.331455792
2010,7,1004,9249,0,0,14.65,0,58,0.298922358,7.313914813,0.291147083",
sep=",",header=TRUE)
As your dependent variable Capex seems to be a company-specific measure, the unit of observation (what plm calls the individual dimension) is likely the company (variable CompanyID), which is to be specified in the index argument.
Thus, a basic 2-way model can be estimated by:
plm(Capex ~ CEO.background + MBA.CEO + CEO.Tenure + Female.CEO + CEO.age + Log.TA + Leverage, data=repexcapex, index = (c("CompanyID", "Year")), model = "within", effect = "twoways")
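The reason the index matters can be checked directly: plm requires each (individual, time) pair to appear at most once. A minimal base R sketch, assuming a stripped-down copy of the sample data that keeps only the three identifier columns (the data frame name ids is made up for this illustration):

```r
# Hypothetical reduced copy of the sample data: identifier columns only.
ids <- data.frame(
  Year      = rep(2005:2010, times = 3),
  Industry  = rep(c(6, 6, 7), each = 6),
  CompanyID = rep(c(1075, 1743, 1004), each = 6)
)

# (Industry, Year) repeats because two firms share industry 6,
# which is what triggers "duplicate couples (id-time)".
dup_industry_year <- any(duplicated(ids[, c("Industry", "Year")]))

# (CompanyID, Year) identifies every row uniquely, so it is a valid index.
dup_company_year <- any(duplicated(ids[, c("CompanyID", "Year")]))

print(c(industry_year = dup_industry_year, company_year = dup_company_year))
```

The same check can be run on the full data before calling plm to confirm that the chosen index identifies each row.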
To add industry fixed effects, include + factor(Industry) in the formula. This variable will likely drop out of the estimation: since industry does not change over time within a firm, it is perfectly collinear with the company fixed effects (this is the case for the small sample data you provided).
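If the goal is industry and year effects without company effects, one option is to enter both as dummy variables in a pooled regression. The sketch below is an illustration rather than the only way to do this: it uses base R's lm on the sample data from the question, and it is a different model from the two-way company/year specification above. The object names industries_per_firm and fit are made up for this example.

```r
repexcapex <- read.table(text = "
Year,Industry,CompanyID,CEOID,CEO.background,MBA.CEO,CEO.Tenure,Female.CEO,CEO.age,Capex,Log.TA,Leverage
2005,6,1075,10739,0,0,6.92,0,55,0.08623238,9.199961396,0.330732917
2006,6,1075,10739,0,0,7.92,0,56,0.097455145,9.334559982,0.26575725
2007,6,1075,10739,0,0,8.92,0,56,0.113033772,9.346263914,0.285439531
2008,6,1075,10739,0,0,9.92,0,57,0.108640177,9.327564318,0.322985772
2009,6,1075,5835,0,0,0.67,0,54,0.08526524,9.360491034,0.333880116
2010,6,1075,5835,0,0,1.67,0,55,0.081452292,9.376545673,0.32197511
2005,6,1743,8379,0,0,17.43,0,65,0.236487293,6.693007633,0.021915227
2006,6,1743,26012,0,1,0.91,0,59,0.319264835,6.820455133,0.023157959
2007,6,1743,26012,0,1,1.91,0,58,0.207384938,6.844512984,0.020087012
2008,6,1743,26012,0,1,2.92,0,59,0.130632264,6.890964093,0.017103795
2009,6,1743,26012,0,1,3.92,0,60,0.112029325,6.879662342,0.017283796
2010,6,1743,30801,0,0,1,1,47,0.02804693,6.767971236,0.044755539
2005,7,1004,9249,0,0,9.65,0,53,0.076370794,6.596094672,0.31534354
2006,7,1004,9249,0,0,10.65,0,54,0.114891589,6.886346743,0.327808308
2007,7,1004,9249,0,0,11.65,0,55,0.097727719,6.973199328,0.307086799
2008,7,1004,9249,0,0,12.65,0,56,0.112119583,7.216716829,0.389800369
2009,7,1004,9249,0,0,13.65,0,57,0.086281135,7.228033526,0.331455792
2010,7,1004,9249,0,0,14.65,0,58,0.298922358,7.313914813,0.291147083",
sep = ",", header = TRUE)

# Industry is constant within each firm, which is why it cannot be
# separated from company fixed effects in a within model.
industries_per_firm <- tapply(repexcapex$Industry, repexcapex$CompanyID,
                              function(x) length(unique(x)))
print(industries_per_firm)

# Pooled OLS with industry and year dummies (no company effects).
fit <- lm(Capex ~ CEO.background + MBA.CEO + CEO.Tenure + Female.CEO +
            CEO.age + Log.TA + Leverage + factor(Industry) + factor(Year),
          data = repexcapex)

# CEO.background is all zeros in this sample, so its coefficient is NA.
print(coef(fit))
```

With only 18 observations the estimates themselves are not meaningful; the point is the specification. The same dummies can also be placed in a plm formula with model = "pooling" and index = c("CompanyID", "Year") if plm's tooling (for example its robust covariance functions) is needed.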