Keeping a huge matrix in memory across multiple runs of a C++ program
I'm writing some C++ code (using the Eigen3 matrix library) to solve a convex optimization problem involving a huge sparse matrix. It takes a minute or so to read the matrix in from a file, and I don't want to do that every single time I run my program. (I'm going to be tuning the parameters of my optimization algorithm, which involves running my code many times in a row, and I don't want to wait a minute for the big matrix to load each time.)
Is there a way that I can keep this big matrix in memory while I change some parameters in my code, then recompile my code and run it again?
This kind of thing is easy in Matlab, but I don't know how it's handled in C++ (although this is a common situation, so there must be a standard approach that people take).
Is there a way that I can keep this big matrix in memory while I change some parameters in my code, then recompile my code and run it again?
AFAIK, no operating system supports keeping a process's memory around while the process is not running and then "rerunning" the process on top of it.
You could try to:
But most of these will (though fun) be extremely complex to implement.
I'm going to be tuning the parameters in my optimization algorithm, which involves running my code many times in a row, and I don't want to have to wait one minute to read in the big matrix each time.
How about getting those parameters from user input instead of hard-coding them? That would allow you to specify the parameters, run your code, read in another set of parameters, do another run, and so on, without having to recompile your program or stop and restart the process.
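A minimal sketch of that idea, assuming a single numeric parameter per run. The names load_matrix_once, run_solver, tuning_loop and step_size are hypothetical stand-ins, not part of the original program; the point is only that the expensive load happens once, before the loop.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the expensive one-minute matrix load.
std::vector<double> load_matrix_once() {
    return {1.0, 2.0, 3.0, 4.0};
}

// Hypothetical stand-in for one optimization run: a scaled sum of the entries.
double run_solver(const std::vector<double>& matrix, double step_size) {
    double total = 0.0;
    for (double v : matrix) total += step_size * v;
    return total;
}

// Reads one parameter per line from `in` and reruns the solver on the
// already-loaded matrix. Stops at end of input or at the line "quit".
// Returns the number of solver runs performed.
int tuning_loop(std::istream& in, std::ostream& out) {
    const std::vector<double> matrix = load_matrix_once();  // cost paid once
    assert(!matrix.empty());
    int runs = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line == "quit") break;
        std::istringstream fields(line);
        double step_size = 0.0;
        if (!(fields >> step_size)) {
            out << "could not parse: " << line << '\n';
            continue;  // bad input does not cost another load
        }
        out << "step_size=" << step_size
            << " result=" << run_solver(matrix, step_size) << '\n';
        ++runs;
    }
    return runs;
}
```

Calling tuning_loop(std::cin, std::cout) from main would give an interactive session; piping a file of parameter values into the program works the same way.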
Your case is a perfect example of why mmap() exists :)
mmap() (available on POSIX systems; Windows offers an equivalent through CreateFileMapping/MapViewOfFile) lets you treat a file on disk as regular RAM, with "direct" random read/write access and OS-backed paging (much like what happens to your memory when the OS's memory manager swaps it out).
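As a rough illustration of treating a file as ordinary memory, here is a small POSIX-only sketch. The helper names write_doubles and sum_mapped_doubles are made up for this example, and it assumes the file holds nothing but raw doubles written on the same machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: writes raw doubles to `path` so there is a file to map.
bool write_doubles(const char* path, const std::vector<double>& values) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t written =
        std::fwrite(values.data(), sizeof(double), values.size(), f);
    const bool closed = std::fclose(f) == 0;
    return closed && written == values.size();
}

// Maps the file read-only and sums its contents through a plain pointer,
// exactly as if the data were an in-memory array. Returns false on failure.
bool sum_mapped_doubles(const char* path, double& sum_out) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return false;
    }
    const std::size_t bytes = static_cast<std::size_t>(st.st_size);
    void* base = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (base == MAP_FAILED) return false;
    assert(bytes % sizeof(double) == 0);  // assumption: file is only doubles
    const double* data = static_cast<const double*>(base);
    sum_out = 0.0;
    for (std::size_t i = 0; i < bytes / sizeof(double); ++i) sum_out += data[i];
    munmap(base, bytes);
    return true;
}
```

No explicit read call copies the file into a buffer here: pages are brought in by the OS when they are first touched, and on a second run shortly afterwards they are often still in the page cache.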
Is there a way that I can keep this big matrix in memory while I change some parameters in my code, then recompile my code and run it again?
Well, yes... But I have a feeling its implementation would be way outside the scope of your project. In essence, this is what you'd do:
You can dump the data of your matrix in binary form: just dump everything pointed to by S.outerIndexPtr(), S.innerIndexPtr(), and S.valuePtr() (perhaps write all the sizes at the start, if they are not always the same).
To read it again, just mmap your file and create a Map<SparseMatrix> from the correct start addresses.
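The dump-and-map round trip could look roughly like the sketch below. To stay self-contained it works on plain arrays instead of Eigen types, and everything about the file layout (the Header struct, the padding before the values, the names dump_csc, map_csc and unmap_csc) is an assumption made for this example. With Eigen, the three input pointers would come from S.outerIndexPtr(), S.innerIndexPtr() and S.valuePtr() of a compressed, column-major SparseMatrix<double> with the default int index type, and the three mapped pointers, together with the sizes from the header, are what you would hand to Map<SparseMatrix>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Assumed layout: Header | outer[cols+1] | inner[nnz] | padding | values[nnz].
struct Header { std::int64_t rows, cols, nnz; };

// Non-owning view over the mapped arrays; valid until unmap_csc() is called.
struct MappedCsc {
    void* base = nullptr;
    std::size_t bytes = 0;
    Header header{};
    const int* outer = nullptr;
    const int* inner = nullptr;
    const double* values = nullptr;
};

// Byte offset of the value array, rounded up so the doubles stay aligned.
std::size_t values_offset(const Header& h) {
    const std::size_t raw =
        sizeof(Header) + sizeof(int) * static_cast<std::size_t>(h.cols + 1 + h.nnz);
    return (raw + alignof(double) - 1) / alignof(double) * alignof(double);
}

bool dump_csc(const char* path, const Header& h, const int* outer,
              const int* inner, const double* values) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t n_outer = static_cast<std::size_t>(h.cols + 1);
    const std::size_t nnz = static_cast<std::size_t>(h.nnz);
    const std::size_t pad =
        values_offset(h) - (sizeof(Header) + sizeof(int) * (n_outer + nnz));
    const char zeros[alignof(double)] = {};
    const bool ok = std::fwrite(&h, sizeof h, 1, f) == 1
                 && std::fwrite(outer, sizeof(int), n_outer, f) == n_outer
                 && std::fwrite(inner, sizeof(int), nnz, f) == nnz
                 && std::fwrite(zeros, 1, pad, f) == pad
                 && std::fwrite(values, sizeof(double), nnz, f) == nnz;
    return (std::fclose(f) == 0) && ok;
}

bool map_csc(const char* path, MappedCsc& m) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size < static_cast<off_t>(sizeof(Header))) {
        close(fd);
        return false;
    }
    m.bytes = static_cast<std::size_t>(st.st_size);
    m.base = mmap(nullptr, m.bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping outlives the descriptor
    if (m.base == MAP_FAILED) { m = MappedCsc{}; return false; }
    const char* p = static_cast<const char*>(m.base);
    std::memcpy(&m.header, p, sizeof(Header));
    // Sanity check: the sizes in the header must fit inside the file.
    assert(values_offset(m.header)
           + sizeof(double) * static_cast<std::size_t>(m.header.nnz) <= m.bytes);
    m.outer = reinterpret_cast<const int*>(p + sizeof(Header));
    m.inner = m.outer + m.header.cols + 1;
    m.values = reinterpret_cast<const double*>(p + values_offset(m.header));
    return true;
}

void unmap_csc(MappedCsc& m) {
    if (m.base) munmap(m.base, m.bytes);
    m = MappedCsc{};
}
```

The file is only meaningful on a machine with the same endianness and type sizes as the one that wrote it, which is fine for a tuning loop on a single workstation. The mapping must stay alive for as long as anything still points into it.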