Saving output from parallel jobs in R into one file

I am running a rather lengthy job that I need to replicate 100 times, so I have turned to the foreach package in R, which I run on an 8-core cluster through a shell script. I am trying to write the results from every run to the same file. I have included a simplified version of my code.

cl<-makeCluster(core-1)
registerDoParallel(cl,cores=core)
SigEpsilonSq<-list()
SigLSq<-list()
RatioMat<-list()
foreach(p=1:100) %dopar%{

functions defining my variables{...}

  for(i in 1:fMaxInd){
   rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
     sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
     rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
     biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
     Qcbar[,i]<-Qflbar-biasCorrV[,i]
     sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
     ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2

   }   

   sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
   SigEpsilonSq[[p]]<-sigmaEpsSqV
   SigLSq[[p]]<-sigmaExtSq
   RatioMat[[p]]<-ratioMatr 

} #End of the dopar loop

stopCluster(cl)

write.csv(SigEpsilonSq,file="Sigma_Epsilon_Sq.csv")
write.csv(SigLSq,file="Sigma_L_Sq.csv")
write.csv(RatioMat,file="Ratio_Matrix.csv")

When the job completes, my .csv files are empty. I believe I don't quite understand how foreach saves results and how I can access them. I would like to avoid having to merge files manually. Also, do I need to call stopCluster(cl) at the end of my foreach loop, or do I wait until the very end? Any help would be much appreciated.

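This is not how foreach works; you should look at the examples in its documentation. With %dopar%, each iteration runs in a worker process on its own copy of the variables, so assignments such as SigEpsilonSq[[p]] <- ... never reach the lists in your main session. What foreach returns is the value of the last expression in the loop body, collected across iterations (in a list by default), and the .combine argument controls how those values are merged. Also, instead of this: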

sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
SigEpsilonSq[[p]]<-sigmaEpsSqV
SigLSq[[p]]<-sigmaExtSq
RatioMat[[p]]<-ratioMatr 

You have to write something like this as the last expression of the loop body:

list(SigEpsilonSq=as.matrix(sigmaEpsSqV), SigLSq=sigmaExtSq, RatioMat=ratioMatr)
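
A toy loop may make the difference between assigning inside the body and returning a value easier to see. This is only a sketch: the names lost and kept, the two-worker cluster, and the squared numbers are made up for illustration and stand in for the real computation.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)  # two workers are enough for the demonstration
registerDoParallel(cl)

lost <- list()  # assigned to inside the loop, as in the question

kept <- foreach(p = 1:4) %dopar% {
  lost[[p]] <- p^2           # changes a copy on the worker only
  list(p = p, square = p^2)  # last expression: sent back to the main session
}

stopCluster(cl)  # all parallel work is finished at this point

length(lost)  # still 0 in the main session
length(kept)  # 4, one entry per iteration
```

Note where stopCluster(cl) sits: after the last foreach call that uses the cluster, not inside the loop body.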

You can also use rbind, cbind, c, ... to aggregate the results into one final output. You can even supply your own combine function (rbindlist here comes from the data.table package), for example:

.combine=function(x,y)rbindlist(list(x,y))
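
To make the effect of .combine more concrete, here is a small sketch with invented data. It uses %do% so that no cluster is needed (the combining step is the same as with %dopar%), and base rbind rather than rbindlist to avoid the data.table dependency. The names stacked, by_field, combine_fields, eps and ratio are hypothetical.

```r
library(foreach)

# Built-in combiner: each iteration returns a small data frame and rbind
# stacks them into one.
stacked <- foreach(p = 1:3, .combine = rbind) %do% {
  data.frame(run = p, i = 1:2, value = p * c(10, 20))
}
nrow(stacked)  # 6

# Custom combiner: each iteration returns a named list, and the combiner
# merges the lists field by field, binding one column per replication.
combine_fields <- function(x, y) Map(cbind, x, y)

by_field <- foreach(p = 1:3, .combine = combine_fields) %do% {
  list(eps = p * c(1, 2), ratio = p * c(0.1, 0.2))
}
dim(by_field$eps)  # 2 3
```

A field-wise combiner like this leaves one object per output quantity, which is convenient when each quantity goes to its own file.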

The solution below should work. The output should be a list of lists. However, it might be painful to retrieve the results and save them in the correct format. If so, you should design your own .combine function.

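library(doParallel) # also loads foreach and parallel
cl<-makeCluster(core-1)
registerDoParallel(cl)
# The lists SigEpsilonSq, SigLSq and RatioMat are no longer pre-allocated: the loop returns its results instead.
# With .multicombine=TRUE the results are passed to list() together (up to .maxcombine, 100 by default, per call),
# so results has one entry per replication instead of being nested pair by pair.
results <- foreach(p=1:100, .combine=list, .multicombine=TRUE) %dopar%{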

  functions defining my variables{...}

  for(i in 1:fMaxInd){
   rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
     sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
     rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
     biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
     Qcbar[,i]<-Qflbar-biasCorrV[,i]
     sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
     ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2

   }   

   list(SigEpsilonSq=as.matrix(sigmaEpsSqV), SigLSq=sigmaExtSq, RatioMat=ratioMatr)

} #End of the dopar loop

stopCluster(cl)

#Then you extract and save results
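
The extraction step could look roughly like the sketch below. It assumes that results is a plain list with one entry per replication and that each entry is a named list of matrices; the field names, the tiny dimensions, the helper stack_field and the lapply stand-in for the foreach output are all made up for illustration. It uses base R only.

```r
# Stand-in for the foreach output: one named list per replication.
results <- lapply(1:3, function(p) {
  list(
    SigEpsilonSq = matrix(p * c(1, 2), ncol = 1),
    SigLSq       = matrix(p, nrow = 2, ncol = 2),
    RatioMat     = matrix(p / 10, nrow = 2, ncol = 2)
  )
})

# Pull one field out of every replication, tag each block with its run
# number, and stack the blocks into a single data frame.
stack_field <- function(results, field) {
  blocks <- lapply(seq_along(results), function(p) {
    cbind(run = p, as.data.frame(results[[p]][[field]]))
  })
  do.call(rbind, blocks)
}

out_dir <- tempdir()  # replace with the directory you actually want
for (field in c("SigEpsilonSq", "SigLSq", "RatioMat")) {
  write.csv(stack_field(results, field),
            file = file.path(out_dir, paste0(field, ".csv")),
            row.names = FALSE)
}
```

Each quantity ends up in a single .csv file with a run column identifying the replication, so nothing has to be merged by hand afterwards.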
