Fixed effects in R: plm vs lm + factor()

Question

I'm trying to run a fixed effects regression model in R. I want to control for heterogeneity in variables C and D (neither is a time variable).

I tried the following two approaches:

1) Use the plm package: gives me the following error message

formula = Y ~ A + B + C + D

reg = plm(formula, data = data, index = c('C', 'D'), model = 'within')

Error in pdim.default(index[[1]], index[[2]]) : duplicate couples (time-id)

I also tried first creating a panel using

data_p = pdata.frame(data,index=c('C','D'))

But I have repeated observations in both columns.

2) Use factor() and lm: works well

formula = Y ~ A + B + factor(C) + factor(D)
reg = lm(formula, data= data)

What is the difference between the two methods? Why is plm not working for me? Is it because one of the indices should be time?

Answer 1

That error is saying you have repeated id-time pairs formed by variables C and D.
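To see which combinations trigger the error, the (C, D) pairs can be counted directly. The following is a minimal base-R sketch on made-up toy data; the values and the helper names pair_counts, dup_pairs and has_duplicates are assumptions for illustration, not part of the original question.

```r
# Hypothetical toy data: C and D are grouping variables, not id/time.
data <- data.frame(
  C = c("a", "a", "a", "b", "b"),
  D = c("x", "x", "y", "x", "y"),
  Y = c(1.0, 2.0, 3.0, 4.0, 5.0)
)

# Count how many rows share each (C, D) combination.
pair_counts <- aggregate(Y ~ C + D, data = data, FUN = length)
names(pair_counts)[3] <- "n"

# Combinations that occur more than once are the repeated index pairs.
dup_pairs <- pair_counts[pair_counts$n > 1, ]
print(dup_pairs)

# TRUE if at least one (C, D) pair is repeated.
has_duplicates <- anyDuplicated(data[c("C", "D")]) > 0
```

If has_duplicates is TRUE, C and D alone cannot serve as the two index columns.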

Let's say you have a third variable F which, jointly with C, keeps each individual distinct from the others (or your first dimension, whatever it is). Then with dplyr you can create a unique index, say id:

data$id <- data %>% group_indices(C, F)

Then the index argument in plm becomes index = c("id", "D").
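Put together, the steps might look like the sketch below. It runs on simulated data, so every value is an assumption; it uses base R's interaction() as a stand-in for the dplyr call, and it only fits the model if the plm package is installed. Note that plm selects the estimator through its model argument.

```r
# Simulated data for illustration only: C and F jointly identify an
# individual, and D is the second index dimension.
set.seed(1)
data <- expand.grid(C = c("c1", "c2"), F = c("f1", "f2", "f3"), D = 1:4,
                    stringsAsFactors = FALSE)
data$A <- rnorm(nrow(data))
data$B <- rnorm(nrow(data))
data$Y <- 1 + 2 * data$A - data$B + rnorm(nrow(data))

# One integer id per (C, F) combination (base-R stand-in for group_indices).
data$id <- as.integer(interaction(data$C, data$F, drop = TRUE))

# Each (id, D) pair must be unique, otherwise plm reports duplicate couples.
stopifnot(anyDuplicated(data[c("id", "D")]) == 0)

if (requireNamespace("plm", quietly = TRUE)) {
  # index takes column names as strings; model = "within" requests fixed effects.
  reg <- plm::plm(Y ~ A + B, data = data, index = c("id", "D"), model = "within")
  print(summary(reg))
}
```

The uniqueness check before the plm call is the useful part: if it fails on real data, the chosen id still does not separate individuals within each value of D.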

The lm + factor() approach is a solution only if you have distinct observations. If that is not the case, it will not properly weight the results within each id; that is, the fixed effect is not properly identified.
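For comparison, in the simple case of a single grouping variable, the dummy-variable regression and the within (group-demeaning) transformation give the same slope estimate. The base-R sketch below checks this on simulated data; the variable names g, x, y and all numbers are assumptions, and it does not cover the two-index setup from the question.

```r
# Simulated data for illustration only: one grouping variable g with
# repeated observations per group and a group-specific intercept.
set.seed(42)
n_groups <- 5
n_per_group <- 20
g <- rep(seq_len(n_groups), each = n_per_group)
alpha <- rnorm(n_groups, sd = 3)[g]   # unobserved group effect
x <- rnorm(length(g)) + 0.5 * alpha   # regressor correlated with it
y <- alpha + 2 * x + rnorm(length(g))

# Approach 1: dummy variables via factor().
fit_lsdv <- lm(y ~ x + factor(g))

# Approach 2: within transformation, i.e. subtract group means by hand.
x_dm <- x - ave(x, g)
y_dm <- y - ave(y, g)
fit_within <- lm(y_dm ~ x_dm - 1)

# The slope on x agrees up to floating-point error.
slope_lsdv <- unname(coef(fit_lsdv)["x"])
slope_within <- unname(coef(fit_within)["x_dm"])
print(c(lsdv = slope_lsdv, within = slope_within))
```

Only the point estimates are compared here. The standard errors reported by the hand-demeaned regression do not account for the degrees of freedom used up by the group means.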

3 2016-09-20 12:54:22
