Fixed effect model with three indexes for out-of-sample predictions using plm in R

I'm not completely sure whether this belongs here or on stats, but I think it is more of a programming question than a statistics question. Either way, I feel I'm in over my head, so here it goes.

I have panel data on flows from origin countries (iso_o) to destination countries (iso_d) over several years. My independent variables describe characteristics of the origin countries, characteristics of the destination countries, and the relationship between each origin and destination country. My data looks something like this:

set.seed(0)
iso_o <- LETTERS[rep(1:3, each = 3, times = 2)]
iso_d <- LETTERS[rep(1:3, times = 6)]
year <- rep(1990:1991, each = 9, times = 1)
relation <- runif(18, 0, 10)
x1_o <- runif(18, 0, 10)
x2_o <- runif(18, 0, 10)
x1_d <- runif(18, 0, 10)
x2_d <- runif(18, 0, 10)
flow <- rnorm(18, 10, 3)

df <- data.frame(iso_o, iso_d, year, relation, x1_o, x2_o, x1_d, x2_d, flow)

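library(dplyr)  # provides %>%, mutate() and if_else()

# For same-country pairs, reuse the destination values and set relation to 0
df <- df %>%
    mutate(x1_o = if_else(iso_d == iso_o, x1_d, x1_o),
           x2_o = if_else(iso_d == iso_o, x2_d, x2_o),
           relation = if_else(iso_d == iso_o, 0, relation))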

Please ignore the inconsistencies in the data above; it is just an example.

In reality, I have the independent variables for many more countries, and I want to use them to predict the flows between these countries based on my sample. The years in my desired prediction are the same as in my sample. For this I want to use a fixed effects model with the plm function. The problem is that this function only allows for one "individual" index variable, whereas I have two. I can, of course, combine the iso_o and iso_d columns to create one individual index variable, but I want to keep the fixed effects of the sending and receiving country separate.
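Written out, a specification with separate sending and receiving country effects could look like the following. The symbols are assumptions introduced here for illustration and do not come from the original post: $\alpha_o$ and $\gamma_d$ stand for the origin and destination fixed effects, $x_{ot}$ and $x_{dt}$ for the country characteristics, and $r_{odt}$ for the relationship variables.

```latex
\text{flow}_{odt} = \alpha_o + \gamma_d
  + \beta_1^\top x_{ot} + \beta_2^\top x_{dt} + \beta_3^\top r_{odt}
  + \varepsilon_{odt}
```

Combining iso_o and iso_d into a single index would instead replace $\alpha_o + \gamma_d$ with one effect per country pair, $\mu_{od}$.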

How can I run this fixed effects regression? And is it possible to do the out-of-sample prediction I want, or am I missing something? Thanks.

Try this. As far as the coding goes you can trust this solution, but I don't know the differences between the various plm models, so your question may be better addressed on Cross Validated:

library(plm)
# Create an id column that assigns a unique id to each origin-destination pair
pair <- paste(df$iso_o, df$iso_d, sep = "_")
df$id <- match(pair, unique(pair))

# Set up the plm model: id is the individual index, year is the time index
model <- plm(flow ~ relation + x1_o + x2_o + x1_d + x2_d,
             data = df, index = c("id", "year"))

summary(model)
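Note that the combined id above gives one fixed effect per origin-destination pair, not separate effects for the sending and receiving country. One possible way to keep them separate is a dummy-variable (LSDV) regression. The following is only a minimal sketch: it swaps plm for base R's lm(), rebuilds a simplified version of the example data (without the same-country adjustments), and the object names lsdv, pred_in, unseen and pred_out are made up for illustration.

```r
# Rebuild a simplified version of the example data (base R only)
set.seed(0)
df <- data.frame(
  iso_o    = LETTERS[rep(1:3, each = 3, times = 2)],
  iso_d    = LETTERS[rep(1:3, times = 6)],
  year     = rep(1990:1991, each = 9),
  relation = runif(18, 0, 10),
  x1_o     = runif(18, 0, 10),
  x2_o     = runif(18, 0, 10),
  x1_d     = runif(18, 0, 10),
  x2_d     = runif(18, 0, 10),
  flow     = rnorm(18, 10, 3),
  stringsAsFactors = FALSE
)

# Separate origin, destination and year effects entered as dummy variables
lsdv <- lm(flow ~ relation + x1_o + x2_o + x1_d + x2_d +
             factor(iso_o) + factor(iso_d) + factor(year),
           data = df)

# Prediction works for countries and years that appear in the estimation sample
pred_in <- predict(lsdv, newdata = df[1:3, ])

# A country that was never in the sample has no estimated fixed effect,
# so predict() cannot handle it; the error is caught here instead of stopping
unseen <- df[1, ]
unseen$iso_o <- "Z"  # hypothetical origin country outside the sample
pred_out <- tryCatch(predict(lsdv, newdata = unseen),
                     error = function(e) e)
inherits(pred_out, "error")
```

This also bears on the out-of-sample part of the question: a country fixed effect can only be estimated for countries that are in the sample, so predictions for other countries would need a different approach, which is probably a question for Cross Validated.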
