How to include missing NAs in an R data.frame of an unbalanced panel data set?
I have unbalanced panel data and need to include all the missing observations. For example, I have something like this:
          YEAR VAR
FIRM.1  YEAR.1 x.1
FIRM.1  YEAR.3 x.2
FIRM.2  YEAR.2 x.3
FIRM.2  YEAR.3 x.4
I would like to add the missing NAs:
          YEAR VAR
FIRM.1  YEAR.1 x.1
FIRM.1  YEAR.2 NA
FIRM.1  YEAR.3 x.2
FIRM.2  YEAR.1 NA
FIRM.2  YEAR.2 x.3
FIRM.2  YEAR.3 x.4
What is the most convenient way to do this?
I would use expand.grid and merge.
Assume your data look like this:
mydf <- structure(list(FIRM = c("FIRM.1", "FIRM.1", "FIRM.2", "FIRM.2"),
    YEAR = c("YEAR.1", "YEAR.3", "YEAR.2", "YEAR.3"), VAR = c("x.1", "x.2",
    "x.3", "x.4")), .Names = c("FIRM", "YEAR", "VAR"),
    class = "data.frame", row.names = c(NA, -4L))
mydf
#     FIRM   YEAR VAR
# 1 FIRM.1 YEAR.1 x.1
# 2 FIRM.1 YEAR.3 x.2
# 3 FIRM.2 YEAR.2 x.3
# 4 FIRM.2 YEAR.3 x.4
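Use expand.grid to create the "complete" set of "FIRM" and "YEAR" combinations, then merge it with your data. Setting all.y = TRUE keeps every row of the grid, so combinations that are absent from mydf get NA in VAR.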
merge(mydf, expand.grid(FIRM = unique(mydf$FIRM),
                        YEAR = unique(mydf$YEAR)),
      all.y = TRUE)
#     FIRM   YEAR  VAR
# 1 FIRM.1 YEAR.1  x.1
# 2 FIRM.1 YEAR.2 <NA>
# 3 FIRM.1 YEAR.3  x.2
# 4 FIRM.2 YEAR.1 <NA>
# 5 FIRM.2 YEAR.2  x.3
# 6 FIRM.2 YEAR.3  x.4
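
If you need this for more than one data set, the same expand.grid plus merge idea can be wrapped in a small function. The following is only a sketch under some assumptions: the helper name complete_panel and its keys argument are made up for illustration and are not part of the answer above, and the grid only contains key values that occur at least once in the data (a year that no firm reports will not be created).

```r
# Hypothetical helper (not from the original answer): pad a long-format
# data.frame so that every combination of the key columns appears once.
complete_panel <- function(df, keys) {
  # Unique observed values of each key column, as a named list
  key_values <- lapply(df[keys], unique)
  # Full grid of combinations; keep strings as character rather than factor
  full <- expand.grid(key_values, stringsAsFactors = FALSE,
                      KEEP.OUT.ATTRS = FALSE)
  # Keep every row of the grid; unmatched combinations get NA elsewhere
  merge(df, full, by = keys, all.y = TRUE)
}

# Same example data as above, built with data.frame() for readability
mydf <- data.frame(
  FIRM = c("FIRM.1", "FIRM.1", "FIRM.2", "FIRM.2"),
  YEAR = c("YEAR.1", "YEAR.3", "YEAR.2", "YEAR.3"),
  VAR  = c("x.1", "x.2", "x.3", "x.4"),
  stringsAsFactors = FALSE
)

balanced <- complete_panel(mydf, keys = c("FIRM", "YEAR"))

# Sanity checks: 2 firms x 3 years = 6 rows, 2 of them padded with NA
stopifnot(nrow(balanced) == 6, sum(is.na(balanced$VAR)) == 2)
print(balanced)
```

Passing by = keys explicitly avoids relying on merge guessing the join columns from shared names, which matters if the data has other columns that happen to match.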