Fixed effect model with three indexes for out-of-sample predictions using plm in R

I'm not completely sure if this belongs here or in stats, but I think it is more of a programming question than a statistics question. Either way I feel I'm in over my head so here it goes.

I have panel data about some flows from origin countries iso_o to destination countries iso_d for several years. As independent variables I have variables with characteristics of the origin countries, destination countries and variables concerning the relationship between origin and destination country. My data looks something like this:

set.seed(0)
iso_o <- LETTERS[rep(1:3, each = 3, times = 2)]
iso_d <- LETTERS[rep(1:3, times = 6)]
year <- rep(1990:1991, each = 9, times = 1)
relation <- runif(18, 0, 10)
x1_o <- runif(18, 0, 10)
x2_o <- runif(18, 0, 10)
x1_d <- runif(18, 0, 10)
x2_d <- runif(18, 0, 10)
flow <- rnorm(18, 10, 3)

df <- data.frame(iso_o, iso_d, year, relation, x1_o, x2_o, x1_d, x2_d, flow)

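library(dplyr)  # provides %>%, mutate() and if_else()

# For domestic rows (origin == destination), copy the destination values into the origin columns
df <- df %>%
    mutate(x1_o = if_else(iso_d == iso_o, x1_d, x1_o),
           x2_o = if_else(iso_d == iso_o, x2_d, x2_o),
           relation = if_else(iso_d == iso_o, 0, relation))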

Please ignore the inconsistencies in the data above, it is just an example.

In reality, I have the independent variables for many more countries, and I want to use them to predict the flows between those countries based on my sample. The years in my desired prediction are the same as in my sample. For this I want to use a fixed effects model with the plm function. The problem is that this function only allows one "individual" index variable, whereas I have two. I can, of course, combine the iso_o and iso_d columns into a single individual index variable, but I want to keep the fixed effects of the sending and receiving countries separate.

How can I run this fixed effect regression? And is it possible to do the out-of-sample prediction I want or am I missing something? Thanks.

Try this. As far as the coding goes you can trust this solution, but I don't know the differences between the various plm models, so the statistical side of your question may be better addressed on Cross Validated:

library(plm)

# Create a column called id that gives each unique origin-destination combination its own integer id
pair <- paste(df$iso_o, df$iso_d, sep = "_")
df$id <- match(pair, unique(pair))

# Set up the plm model, indexed by pair id and year (model = "within", i.e. fixed effects, is plm's default)
model <- plm(flow ~ relation + x1_o + x2_o + x1_d + x2_d,
             data = df, index = c("id", "year"))

summary(model)
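
Note that the model above estimates one fixed effect per origin-destination pair, not separate effects for the sending and receiving country. If separate effects are what you need, one option is a dummy-variable regression. The following is only a rough sketch under assumptions: it swaps plm for base R's lm, enters iso_o, iso_d and year as factors, and rebuilds the example data without the dplyr step. The objects fit, new_ok, new_bad, pred_ok and pred_bad are illustrative names and the values in new_ok are made up. It also illustrates the limit on out-of-sample prediction: a country that never appears in the sample has no estimated effect, so predict() cannot produce a value for it.

```r
# Rebuild the example data from the question (base R only, no dplyr step).
set.seed(0)
df <- data.frame(
  iso_o    = LETTERS[rep(1:3, each = 3, times = 2)],
  iso_d    = LETTERS[rep(1:3, times = 6)],
  year     = rep(1990:1991, each = 9),
  relation = runif(18, 0, 10),
  x1_o     = runif(18, 0, 10),
  x2_o     = runif(18, 0, 10),
  x1_d     = runif(18, 0, 10),
  x2_d     = runif(18, 0, 10),
  flow     = rnorm(18, 10, 3)
)

# Assumption: separate origin, destination and year effects are entered
# as dummy variables in an ordinary least-squares fit (lm, not plm).
fit <- lm(flow ~ relation + x1_o + x2_o + x1_d + x2_d +
            factor(iso_o) + factor(iso_d) + factor(year),
          data = df)

# Hypothetical new observation whose origin and destination both occur in the sample.
new_ok <- data.frame(iso_o = "A", iso_d = "C", year = 1991,
                     relation = 5, x1_o = 5, x2_o = 5, x1_d = 5, x2_d = 5)
pred_ok <- predict(fit, newdata = new_ok)

# Same row, but with an origin country ("Z") that is not in the sample:
# predict() stops with a "new levels" error, which is caught and turned into NA here.
new_bad <- transform(new_ok, iso_o = "Z")
pred_bad <- tryCatch(predict(fit, newdata = new_bad),
                     error = function(e) NA_real_)
```

With this approach, predictions are available for any combination of countries and years that each occur somewhere in the sample, even if that particular origin-destination pair was never observed. Whether this specification is appropriate for your data is a statistical question rather than a coding one.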

