Including non-linearity in a fixed effects model in plm

I am trying to build a fixed effects regression with the plm package in R. I am using country-level panel data with year and country fixed effects. My problem concerns two explanatory variables: one is an interaction term of two variables and the other is the squared term of one of the variables.

The model is basically: y = x1 + x1^2 + x3 + x1*x3 + ... + xn, with all variables in log form.

It is central to the model to include the squared term, but when I run the regression it always gets dropped because of "singularities", as x1 and x1^2 are obviously correlated. The regression runs and I get estimates for my other variables, just not for x1^2 and x1*x2. How do I circumvent this?

library(plm)
fe_reg <- plm(log(y) ~ log(x1) + log(x2) + log(x2^2) + log(x1 * x2) + dummy,
              data = df,
              index = c("country", "year"),
              model = "within",
              effect = "twoways")
summary(fe_reg)

I have tried defining the interaction and squared terms as vectors, which helped with the interaction term but not the squared term.

library(dplyr)
# scale both variables in one step (a second mutate_at() starting from df1 would discard the first result)
df1.pd <- df1 %>% mutate_at(c("x1", "x2"), ~ as.vector(scale(.)))
I am pretty new to R, so apologies if this is not a very well structured question.

You just found a property of the logarithm function:

log(x^2) = 2 * log(x)

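Hence log(x^2) is perfectly collinear with log(x), and one of the two collinear variables is dropped from the estimation. The same applies to the interaction term: log(x1*x2) = log(x1) + log(x2), which is an exact linear combination of regressors already in the model.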
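The collinearity can be checked numerically without plm. The following is a minimal base-R sketch on simulated data (the variables x1 and x2 and the column names are made up for illustration, they are not the asker's data): it verifies the two logarithm identities, log(x^2) = 2 * log(x) and log(x1*x2) = log(x1) + log(x2), and shows that a regressor matrix built like the right-hand side of the question's formula has fewer independent columns than columns.

```r
# Simulated positive regressors (hypothetical data, for illustration only)
set.seed(1)
x1 <- runif(50, min = 1, max = 10)
x2 <- runif(50, min = 1, max = 10)

# The two logarithm identities behind the dropped coefficients
sq_identity  <- isTRUE(all.equal(log(x2^2), 2 * log(x2)))
int_identity <- isTRUE(all.equal(log(x1 * x2), log(x1) + log(x2)))

# Regressor matrix mirroring the right-hand side of the question's formula
X <- cbind(lx1   = log(x1),
           lx2   = log(x2),
           lx2sq = log(x2^2),
           lx1x2 = log(x1 * x2))

# Four columns, but only two of them are linearly independent
X_rank <- qr(X)$rank
print(c(columns = ncol(X), rank = X_rank))
```

With four columns and rank two, two coefficients cannot be identified, which matches the two terms that are dropped in the question.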

So the model as specified cannot be estimated by linear regression methods: log(x) and log(x^2) carry exactly the same information. You might want to consider data transformations other than log, or use the original (untransformed) variables.
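One possible alternative, offered here as an assumption and not as something stated in the question or in this answer, is to square the logged variable, (log x)^2, instead of logging the square, and to interact the logs instead of logging the product. A minimal sketch on the Grunfeld data shipped with plm (the choice of variables is purely illustrative):

```r
library(plm)
data("Grunfeld", package = "plm")

# Square the logged variable, (log x)^2, rather than logging the square, log(x^2);
# the interaction is the product of the logs, not the log of the product
mod_alt <- plm(inv ~ log(value) + I(log(value)^2) + log(capital) +
                 log(value):log(capital),
               data   = Grunfeld,
               index  = c("firm", "year"),
               model  = "within",
               effect = "twoways")

n_coef <- length(coef(mod_alt))  # number of coefficients actually estimated
summary(mod_alt)
```

Neither (log x)^2 nor log(x1) * log(x2) is a linear function of the logs themselves, so these terms are not perfectly collinear with them. Whether such a specification is meaningful for a given research question is a separate matter.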

See also the reproducible example below. Linear dependence can be detected, e.g., via the function detect.lindep from package plm (see also below). Coefficients being dropped from the estimation also hints at collinear columns in the model matrix. At times, linear dependence appears only after the data transformations involved in the estimation functions (for an example with the within transformation, see the Examples section of the help page ?detect.lindep).
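The last point, dependence that only shows up after the within transformation, can be sketched in base R. This is a constructed toy case with hypothetical variables (id, x, z), not the example from the help page: z equals x plus an individual-specific constant, so the raw columns are independent, but demeaning by individual makes them identical.

```r
# Toy panel: 5 individuals, 4 periods each (hypothetical data)
set.seed(42)
id <- rep(1:5, each = 4)
x  <- rnorm(20)
z  <- x + 10 * id          # x plus a constant that differs by individual

X <- cbind(x = x, z = z)
rank_raw <- qr(X)$rank     # raw columns are linearly independent

# Within transformation: subtract each individual's mean from every column
X_within <- apply(X, 2, function(col) col - ave(col, id))
rank_within <- qr(X_within)$rank  # demeaned columns coincide

print(c(raw = rank_raw, within = rank_within))
```

Here the rank falls from two to one once the individual means are removed, which is why it can help to check the transformed model (as detect.lindep does when run on an estimated model) and not only the raw data.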

library(plm)
data("Grunfeld")
pGrun <- pdata.frame(Grunfeld)
pGrun$lvalue  <- log(pGrun$value)   # log(x)
pGrun$lvalue2 <- log(pGrun$value^2) # log(x^2) == 2 * log(x)

mod  <- plm(inv ~ lvalue + lvalue2 + capital, data = pGrun, model = "within")
summary(mod)
#> Oneway (individual) effect Within Model
#> 
#> Call:
#> plm(formula = inv ~ lvalue + lvalue2 + capital, data = pGrun, 
#>     model = "within")
#> 
#> Balanced Panel: n = 10, T = 20, N = 200
#> 
#> Residuals:
#>       Min.    1st Qu.     Median    3rd Qu.       Max. 
#> -186.62916  -20.56311   -0.17669   20.66673  300.87714 
#> 
#> Coefficients: (1 dropped because of singularities)
#>          Estimate Std. Error t-value Pr(>|t|)    
#> lvalue  30.979345  17.592730  1.7609  0.07988 .  
#> capital  0.360764   0.020078 17.9678  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Total Sum of Squares:    2244400
#> Residual Sum of Squares: 751290
#> R-Squared:      0.66525
#> Adj. R-Squared: 0.64567
#> F-statistic: 186.81 on 2 and 188 DF, p-value: < 2.22e-16

detect.lindep(mod) # run on the model 
#> [1] "Suspicious column number(s): 1, 2"
#> [1] "Suspicious column name(s):   lvalue, lvalue2"
detect.lindep(pGrun) # run on the data
#> [1] "Suspicious column number(s): 6, 7"
#> [1] "Suspicious column name(s):   lvalue, lvalue2"
