R: How do I apply a for loop across multiple data frames?
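In R, I have 3 data frames that resemble the sample versions I've provided below. The first, Data, is the main data set. The TW and UW data frames share a key variable with Data (MN-mapping_for_N), and for each value of that key (for example N48) they hold 1000 different draws. I've included 3 of them here.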
Data <- matrix(c(4720, 44.29, "Work or Private Clinic", "N48",
                 2659, 55.05, "Hospital", "N1",
                 1612, 59.99, "No Care", "N48"),
               ncol = 4, byrow = TRUE)
colnames(Data) <- c("studyid", "Pred_ex", "wherecare", "MN-mapping_for_N")
Data <- data.frame(Data)
TW <- matrix(c("N48", 0.07, 0.08, 0.09, "N1", 0.10, 0.11, 0.12,
               "N2", 0.02, 0.03, 0.04, "N3", 0.04, 0.05, 0.06),
             ncol = 4, byrow = TRUE)
colnames(TW) <- c("MN-mapping_for_N", "draw1", "draw2", "draw3")
TW <- data.frame(TW)
UW <- matrix(c("N48", 0.71, 0.81, 0.91, "N1", 0.11, 0.111, 0.131,
               "N2", 0.021, 0.031, 0.041, "N3", 0.041, 0.051, 0.061),
             ncol = 4, byrow = TRUE)
colnames(UW) <- c("MN-mapping_for_N", "draw1", "draw2", "draw3")
UW <- data.frame(UW)
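A side note on the setup above, offered as a small sketch rather than as part of the original post: by default data.frame() passes column names through make.names(), so "MN-mapping_for_N" becomes MN.mapping_for_N unless check.names = FALSE is given. Because each matrix mixes strings and numbers, the draw columns are also stored as text rather than as numbers. The object names m, checked and unchecked are made up for this illustration.

```r
# Hypothetical one-row matrix mirroring the structure of TW
m <- matrix(c("N48", 0.07), ncol = 2,
            dimnames = list(NULL, c("MN-mapping_for_N", "draw1")))

checked   <- data.frame(m)                       # default: check.names = TRUE
unchecked <- data.frame(m, check.names = FALSE)  # keeps the original name

names(checked)    # "MN.mapping_for_N" "draw1"
names(unchecked)  # "MN-mapping_for_N" "draw1"

# c("N48", 0.07) is a character vector, so draw1 is not numeric
is.numeric(checked$draw1)  # FALSE
```

If numeric draws are needed later (for example to multiply by Pred_ex), converting with as.numeric() or building the data frames column by column would likely be needed.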
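My goal is to create a new column whose values come from a randomly selected column of the UW and TW data. Which of the two to pull from depends on the value of Data$wherecare.
I have been using a combination of dplyr and the match function, along with a couple of functions I wrote myself. Currently it looks like this: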
library(dplyr)
drawselect <- function(x) {
  samplepick <- sample(2:1001, 1)
  select(x, 1, num_range("draw", samplepick))
}
DALY_FX_LT_NR <- function(x) {
  draw_T_DW <- drawselect(TW)
  draw_UT_DW <- drawselect(UW)
  drawnames.TW <- colnames(draw_T_DW)
  drawnames.UT <- colnames(draw_UT_DW)
  UT.draw <- drawnames.UT[2]
  T.draw <- drawnames.TW[2]
  print(UT.draw)
  print(T.draw)
  newdf <- x %>% mutate(DW = NA)
  for (i in 1:nrow(newdf)) {
    if (newdf$wherecare[i] != "No Care") {
      newdf$DW = draw_T_DW[, 2][match(newdf$`MN-mapping_for_N`, draw_T_DW$`MN-mapping_for_N`)]
      next
    } else if (newdf$wherecare[i] == "No Care") {
      newdf$DW = draw_UT_DW[, 2][match(newdf$`MN-mapping_for_N`, draw_UT_DW$`MN-mapping_for_N`[i])]
    }
  }
  newdf
}
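The code runs, but I can't seem to get it to actually iterate row by row so that it pulls the draw value from the correct data frame (that is, UW or TW after each has gone through the drawselect function).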
So what I get looks like:
studyid   Pred_ex   wherecare              MN-mapping_for_N   DW
--------- --------- ---------------------- ------------------ ------
4720      44.29     Work or Private Clinic N48                0.08
2659      55.05     Hospital               N1                 0.11
1612      59.99     No Care                N48                0.08
when I should get:
studyid   Pred_ex   wherecare              MN-mapping_for_N   DW
--------- --------- ---------------------- ------------------ ------
4720      44.29     Work or Private Clinic N48                0.08
2659      55.05     Hospital               N1                 0.11
1612      59.99     No Care                N48                0.81
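The key difference is the 0.81 in the bottom-right corner. It's no big deal in the sample data, but the real data is several hundred rows long, so I want the function to "decide correctly" which data set to pull from. That value could be 0.71, 0.81, or 0.91; any UW value for N48 would work.
The ultimate goal is to use this value in a calculation by multiplying it by the Pred_ex column, which I can do, and then to re-run this function many times to bootstrap the data. I can't get there until these if statements work properly. I also tried dplyr::left_join to match these up and ran into a similar problem with the conditional not working. I thought the match function as written would work better, but I'm certainly open to anything.
Any help is greatly appreciated.
Also, thanks to everyone on Stack Overflow; reading your answers to other questions is the main reason I've gotten this far.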
So you don't need a new function (I kept drawselect); you can do the following:
# Pick the random draw once per table so every row uses the same column
draw_TW <- drawselect(TW)
draw_UW <- drawselect(UW)
for (i in seq_len(nrow(Data))) {
  key <- as.character(Data$MN.mapping_for_N[i])
  if (Data$wherecare[i] != "No Care") {
    Data$DW[i] <- draw_TW[which(draw_TW$MN.mapping_for_N == key), 2]
  } else {
    Data$DW[i] <- draw_UW[which(draw_UW$MN.mapping_for_N == key), 2]
  }
}
> Data
studyid Pred_ex wherecare MN.mapping_for_N DW
1 4720 44.29 Work or Private Clinic N48 0.08
2 2659 55.05 Hospital N1 0.11
3 1612 59.99 No Care N48 0.81
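One visible difference from the loop in the question is that this loop assigns to DW[i], whereas the question's loop assigns to the whole DW column on every pass, so whichever branch runs last decides every row. The toy sketch below illustrates just that difference; the data frame df and the labels "from_a" and "from_b" are invented for the example and stand in for the two lookup tables.

```r
df <- data.frame(flag = c("a", "b", "a"), val = NA_character_)

# Whole-column assignment: each iteration overwrites all rows
whole <- df
for (i in seq_len(nrow(whole))) {
  whole$val <- if (whole$flag[i] == "a") "from_a" else "from_b"
}

# Element-wise assignment: each iteration touches only row i
elementwise <- df
for (i in seq_len(nrow(elementwise))) {
  elementwise$val[i] <- if (elementwise$flag[i] == "a") "from_a" else "from_b"
}

whole$val        # "from_a" "from_a" "from_a"
elementwise$val  # "from_a" "from_b" "from_a"
```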
If you want to wrap everything in a single function (including drawselect), try the following:
DALY_FX_LT_NR <- function(x, y, z) {  # x would be Data, y would be TW, z would be UW
  # Columns 2..ncol(y) are draw1..draw(ncol(y) - 1); pick one suffix at random
  samplepick <- sample(seq_len(ncol(y) - 1), 1)
  drawcol <- paste0("draw", samplepick)
  for (i in seq_len(nrow(x))) {
    key <- as.character(x$MN.mapping_for_N[i])
    if (x$wherecare[i] != "No Care") {
      x$DW[i] <- y[which(y$MN.mapping_for_N == key), drawcol]
    } else {
      x$DW[i] <- z[which(z$MN.mapping_for_N == key), drawcol]
    }
  }
  return(x)
}
> DALY_FX_LT_NR(x = Data, y = TW, z = UW)
studyid Pred_ex wherecare MN.mapping_for_N DW
1 4720 44.29 Work or Private Clinic N48 0.09
2 2659 55.05 Hospital N1 0.12
3 1612 59.99 No Care N48 0.91
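Since the question mentions match and a later bootstrap step, here is a loop-free variant as a rough sketch, not a drop-in replacement. It assumes the draw columns are numeric, so the sample data is rebuilt column by column, and it assumes one shared draw for both tables, as in the function above. The function name pick_dw, its draw argument, and the object names res and scaled are hypothetical.

```r
# Sample data rebuilt with numeric draw columns (an assumption of this sketch)
Data <- data.frame(studyid = c(4720, 2659, 1612),
                   Pred_ex = c(44.29, 55.05, 59.99),
                   wherecare = c("Work or Private Clinic", "Hospital", "No Care"),
                   MN.mapping_for_N = c("N48", "N1", "N48"))
TW <- data.frame(MN.mapping_for_N = c("N48", "N1", "N2", "N3"),
                 draw1 = c(0.07, 0.10, 0.02, 0.04),
                 draw2 = c(0.08, 0.11, 0.03, 0.05),
                 draw3 = c(0.09, 0.12, 0.04, 0.06))
UW <- data.frame(MN.mapping_for_N = c("N48", "N1", "N2", "N3"),
                 draw1 = c(0.71, 0.11, 0.021, 0.041),
                 draw2 = c(0.81, 0.111, 0.031, 0.051),
                 draw3 = c(0.91, 0.131, 0.041, 0.061))

# Hypothetical helper: look up both tables for all rows, then choose per row
pick_dw <- function(x, tw, uw, draw = sample(seq_len(ncol(tw) - 1), 1)) {
  drawcol <- paste0("draw", draw)
  from_tw <- tw[[drawcol]][match(x$MN.mapping_for_N, tw$MN.mapping_for_N)]
  from_uw <- uw[[drawcol]][match(x$MN.mapping_for_N, uw$MN.mapping_for_N)]
  x$DW <- ifelse(x$wherecare != "No Care", from_tw, from_uw)
  x
}

res <- pick_dw(Data, TW, UW, draw = 2)  # fix the draw to make the result reproducible
res$DW  # 0.08 0.11 0.81

# Repeating the random draw, one column of DW * Pred_ex per repetition
scaled <- replicate(5, pick_dw(Data, TW, UW)$DW * Data$Pred_ex)
dim(scaled)  # 3 5
```

Leaving draw at its default picks a random column on each call, which is what the repeated-run step at the end relies on.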