Adapt layered R data frame so that values of variables match in rows (based on group and date)
I want to study group A's effect on group B with respect to certain dependent variables that I call "target_n". Because of the way the data was generated, my dataset contains "layers" of information that are ordered by group. That is, rows where Group == "B" hold B's values on the "target_n" variables, and rows where Group == "A" hold A's values on the "X_n" variables. Group "C" is essentially an "other" category, but I need its values in the same rows as A's and B's as well, to make sure that A's effects are on B and not on C. The following should make this clearer:
My data (df) are structured like this:
df<-data.frame(
"Date"=c("1990-03","2000-01","2010-09","1990-03","2000-01","2010-09","1990-03","2000-01","2010-09"),
"Group"=c("A","A","A","B","B","B","C","C","C"),
"X_1_A"=c(9,4,7,NA,NA,NA,NA,NA,NA),
"X_2_A"=c(1,2,6,NA,NA,NA,NA,NA,NA),
"target_1_B"=c(NA,NA,NA,0,2,9,NA,NA,NA),
"target_2_B"=c(NA,NA,NA,9,2,1,NA,NA,NA),
"target_1_C"=c(NA,NA,NA,NA,NA,NA,5,3,1),
"target_2_C"=c(NA,NA,NA,NA,NA,NA,1,9,2)
)
What I want is to compute new variables for both group "A" and group "C" so that everything falls within the same rows. If I were to do that manually, I would take A's "X_1" value at date "1990-03" and assign it to B's row for the same date.
So in the end, my data would look like this:
df<-data.frame(
"Date"=c("1990-03","2000-01","2010-09","1990-03","2000-01","2010-09","1990-03","2000-01","2010-09"),
"Group"=c("A","A","A","B","B","B","C","C","C"),
"X_1_A"=c(9,4,7,NA,NA,NA,NA,NA,NA),
"X_2_A"=c(1,2,6,NA,NA,NA,NA,NA,NA),
"target_1_B"=c(NA,NA,NA,0,2,9,NA,NA,NA),
"target_2_B"=c(NA,NA,NA,9,2,1,NA,NA,NA),
"target_1_C"=c(NA,NA,NA,NA,NA,NA,5,3,1),
"target_2_C"=c(NA,NA,NA,NA,NA,NA,1,9,2),
"NEW_X_1_A"=c(NA,NA,NA,9,4,7,NA,NA,NA),
"NEW_X_2_A"=c(NA,NA,NA,1,2,6,NA,NA,NA),
"NEW_target_1_C"=c(NA,NA,NA,5,3,1,NA,NA,NA),
"NEW_target_2_C"=c(NA,NA,NA,1,9,2,NA,NA,NA)
)
(I have a number of these "X_" variables and exactly the same number of "target_" variables. I also do not have just this one group of A, B and C, but A1, A2, A3, C1, C2, C3 and even more Bs. Each set of A1, B1, C1 also has its own "set" of dates that does not match another set's dates. That is less of a problem, though, as I could simply slice my dataset horizontally into sets, apply the solution to each of them separately and merge them again.)
But how would I bring A's and C's values into B's rows, based on Group == "B" and on the date?
Using data.table you can try:
df<-data.frame(
"Date"=c("1990-03","2000-01","2010-09","1990-03","2000-01","2010-09","1990-03","2000-01","2010-09"),
"Group"=c("A","A","A","B","B","B","C","C","C"),
"X1_A"=c(9,4,7,NA,NA,NA,NA,NA,NA),
"X2_A"=c(1,2,6,NA,NA,NA,NA,NA,NA),
"target_value_1_B"=c(NA,NA,NA,0,2,9,NA,NA,NA),
"target_value_2_B"=c(NA,NA,NA,9,2,1,NA,NA,NA),
"target_value_1_C"=c(NA,NA,NA,NA,NA,NA,5,3,1),
"target_value_2_C"=c(NA,NA,NA,NA,NA,NA,1,9,2)
)
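library(data.table)
# Copy A's and C's values into new columns on the B rows.
# Note: each subset (e.g. X1_A[Group=="A"]) is recycled to the full column length, so the values
# only line up by date when the groups are equally sized blocks with the same dates in the same order.
setDT(df)[,`:=` (NEW_X1 = ifelse(Group=="B",X1_A[Group=="A"],NA),
                 NEW_X2 = ifelse(Group=="B",X2_A[Group=="A"],NA),
                 NEW_target_value_1_C = ifelse(Group=="B",target_value_1_C[Group=="C"],NA),
                 NEW_target_value_2_C = ifelse(Group=="B",target_value_2_C[Group=="C"],NA)
)]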
Which results in:
df
      Date Group X1_A X2_A target_value_1_B target_value_2_B target_value_1_C target_value_2_C NEW_X1 NEW_X2 NEW_target_value_1_C NEW_target_value_2_C
1: 1990-03     A    9    1               NA               NA               NA               NA     NA     NA                   NA                   NA
2: 2000-01     A    4    2               NA               NA               NA               NA     NA     NA                   NA                   NA
3: 2010-09     A    7    6               NA               NA               NA               NA     NA     NA                   NA                   NA
4: 1990-03     B   NA   NA                0                9               NA               NA      9      1                    5                    1
5: 2000-01     B   NA   NA                2                2               NA               NA      4      2                    3                    9
6: 2010-09     B   NA   NA                9                1               NA               NA      7      6                    1                    2
7: 1990-03     C   NA   NA               NA               NA                5                1     NA     NA                   NA                   NA
8: 2000-01     C   NA   NA               NA               NA                3                9     NA     NA                   NA                   NA
9: 2010-09     C   NA   NA               NA               NA                1                2     NA     NA                   NA                   NA
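The ifelse() call above lines values up by position rather than by date. If the groups might list their dates in a different order, or a date might be missing for one group, a lookup keyed on Date is probably safer. The following is a minimal base R sketch under those assumptions, not part of the original answer: copy_by_date is a hypothetical helper name, and it names the new columns NEW_<original name> (so NEW_X1_A rather than NEW_X1).

```r
# Same example data as in the answer above (Date stored as character).
df <- data.frame(
  Date  = rep(c("1990-03", "2000-01", "2010-09"), times = 3),
  Group = rep(c("A", "B", "C"), each = 3),
  X1_A  = c(9, 4, 7, NA, NA, NA, NA, NA, NA),
  X2_A  = c(1, 2, 6, NA, NA, NA, NA, NA, NA),
  target_value_1_B = c(NA, NA, NA, 0, 2, 9, NA, NA, NA),
  target_value_2_B = c(NA, NA, NA, 9, 2, 1, NA, NA, NA),
  target_value_1_C = c(NA, NA, NA, NA, NA, NA, 5, 3, 1),
  target_value_2_C = c(NA, NA, NA, NA, NA, NA, 1, 9, 2)
)

# Hypothetical helper: copy the columns `cols` from the rows of group `from`
# into NEW_<col> on the rows of group `to`, matching rows by Date.
copy_by_date <- function(d, cols, from, to = "B") {
  src   <- d[d$Group == from, , drop = FALSE]
  is_to <- d$Group == to
  # Position of each target date within the source rows; NA if there is no
  # partner row. If a date occurs twice in `src`, the first match is used.
  idx <- match(d$Date[is_to], src$Date)
  for (col in cols) {
    new_col <- paste0("NEW_", col)
    d[[new_col]] <- NA_real_                 # all rows start as NA
    d[[new_col]][is_to] <- src[[col]][idx]   # fill only the target rows
  }
  d
}

res <- copy_by_date(df, c("X1_A", "X2_A"), from = "A")
res <- copy_by_date(res, c("target_value_1_C", "target_value_2_C"), from = "C")

# Inspect the B rows only
res[res$Group == "B", c("Date", "NEW_X1_A", "NEW_X2_A",
                        "NEW_target_value_1_C", "NEW_target_value_2_C")]
```

Because the lookup goes through match() on Date, the result does not depend on how the rows are sorted. With several sets (A1/B1/C1 and so on), the same helper could be applied to each set separately and the pieces bound together again, as the question already suggests.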