
Running jobs in the background in R

I am working with a 250 by 250 matrix. However, it takes a very long time to compute: at least an hour.

Is it possible to store this matrix in memory in R, so that every time I open R it is already there?

Ideally, I would like to know whether it is possible to run a job in the background in R, so that I don't have to wait an hour before I can get the matrix and work with it.

1) You can save the R workspace when closing R. R usually asks "Save workspace image?" when you close it. If you answer "Yes", it saves the workspace in a file named ".RData" in the current working directory and loads it again when a new R instance is started from that directory.

2) The better (safer) option is to save the matrix explicitly. There are several ways to do this. One of them is to save it as an Rdata file:

save(m, file = "matrix.Rdata")

where m is your matrix.

You can load the matrix at any time with

load("matrix.Rdata")

if you are in the same working directory.
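As a rough sketch of the save/load round trip, the snippet below uses random numbers as a stand-in for the real matrix and a full path built with file.path(), so the working directory does not matter. The variable names path and original are made up for this example, and tempdir() is used only to keep the sketch self-contained; it is cleared when the session ends, so a real script would point at a permanent location.

```r
# Stand-in for the expensive 250 x 250 matrix (illustrative data only).
m <- matrix(rnorm(250 * 250), nrow = 250, ncol = 250)

# Hypothetical location; replace tempdir() with a permanent directory in practice.
path <- file.path(tempdir(), "matrix.Rdata")
save(m, file = path)

# Simulate a fresh session by removing the object, then restore it.
original <- m
rm(m)
restored_names <- load(path)  # load() returns the names of the objects it restored

# The matrix comes back under its original name, unchanged.
stopifnot(identical(restored_names, "m"))
stopifnot(identical(m, original))
```

Note that load() recreates the object under the name it was saved with, so it silently overwrites any existing m in the session.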

3) There is no such option as background computing in R, but you can open several R instances. Run the computation in one instance and do something else in another.

What would help is to write the matrix to a file once you have computed it, and then read that file every time you open R. Write yourself a computeMatrix() function or script that produces a file with the matrix stored in a sensible format. Also write yourself a loadMatrix() function or script that reads that file and loads the matrix into memory, then call or run loadMatrix every time you start R and want to use the matrix.
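One possible shape for these two functions is sketched below. It assumes R's serialized .rds format (saveRDS/readRDS) as the "sensible format", which is a choice made for this example rather than something the answer specifies. The body of computeMatrix() is a cheap placeholder for the real hour-long computation, and matrix_file is a hypothetical path that should point somewhere permanent in real use.

```r
# Hypothetical file location; tempdir() is used only to keep the sketch self-contained.
matrix_file <- file.path(tempdir(), "matrix.rds")

# Compute the matrix and store it on disk.
# The outer() call is a placeholder for the real, slow computation.
computeMatrix <- function(file = matrix_file, n = 250) {
  m <- outer(seq_len(n), seq_len(n), function(i, j) i / j)
  saveRDS(m, file = file)
  invisible(m)
}

# Read the stored matrix back into memory, failing clearly if it is missing.
loadMatrix <- function(file = matrix_file) {
  if (!file.exists(file)) {
    stop("No stored matrix at ", file, "; run computeMatrix() first.")
  }
  readRDS(file)
}

computeMatrix()    # run once, whenever the matrix has to be recomputed
m <- loadMatrix()  # run at the start of each session
```

Unlike load(), readRDS() returns the object instead of restoring it under a fixed name, so the caller decides what to call it.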

As for running an R job in the background, you can run an R script from the command line with "R CMD BATCH scriptName", replacing scriptName with the name of your script.
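A script meant to be run this way might look like the sketch below. The file name compute_matrix.R, the function compute_matrix(), and the output file matrix.rds are all hypothetical, and the computation is a cheap placeholder. The shell command appears only in the comments, since it is typed at the command line and not inside R.

```r
# compute_matrix.R (hypothetical file name)
#
# Run from a shell, outside of R:
#   R CMD BATCH compute_matrix.R
# On a Unix-like shell, appending "&" returns the prompt immediately.
# Console output is written to compute_matrix.Rout in the same directory.

# Placeholder for the real, slow computation (an assumption for illustration).
compute_matrix <- function(n = 250) {
  outer(seq_len(n), seq_len(n), function(i, j) sin(i) * cos(j))
}

m <- compute_matrix()

# Write the result to disk so an interactive session can pick it up later
# with readRDS("matrix.rds").
saveRDS(m, file = "matrix.rds")
```

While the batch job runs, a separate interactive R session stays free for other work, and the matrix can be read in once matrix.rds appears.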

It might be better to use the ff package and save the matrix as an ff object. The actual matrix is then stored on disk in an efficient format, and when you start a new R session you can point to that same file without loading the entire matrix into memory. When you need part of the matrix, only that part is loaded, so access is much quicker. Even if you need the entire matrix in memory, it should load faster than reading a text file.
