
Memory mapped database

I have 8 terabytes of data composed of ~5000 arrays of small elements (under a hundred bytes per element). I need to load sections of these arrays (a few dozen megabytes at a time) into memory to use in an algorithm as quickly as possible. Are memory mapped files right for this use, and if not, what else should I use?

Given your requirements I would definitely go with memory mapped files. It's almost exactly what they were made for. A mapping consumes few physical resources, so even extremely large files have little impact on the system compared to other methods, especially since you can map a small view (e.g., the section of an array you need) into the address space just before you access it. The other big benefit is that they give you the simplest working environment possible. You can (mostly) just treat your data as a large memory address space and let Windows worry about the I/O. Obviously, if multiple threads touch the same data you'll need to build in locking mechanisms, but I'm sure you know that.
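To make the "map a small view just before you use it" idea concrete, here is a minimal sketch in Python, whose standard `mmap` module wraps the platform's file mapping facility. The helper name `read_slice` and its parameters are made up for illustration, and the sketch assumes the requested byte range lies entirely inside the file. The one detail that tends to trip people up is that a view's starting offset has to be a multiple of the allocation granularity, so the helper rounds the offset down and skips the extra leading bytes.

```python
import mmap


def read_slice(path, start, length):
    """Return bytes [start, start + length) of a file by mapping only a small view.

    Hypothetical helper for illustration. Assumes the requested range lies
    entirely inside the file.
    """
    if length <= 0:
        return b""

    # A view must begin on a multiple of the allocation granularity,
    # so round the requested start down to the nearest such boundary.
    granularity = mmap.ALLOCATIONGRANULARITY
    aligned_start = (start // granularity) * granularity
    skip = start - aligned_start  # extra leading bytes included by the rounding

    with open(path, "rb") as f:
        # Map only the window we need, not the whole file.
        with mmap.mmap(
            f.fileno(),
            skip + length,
            access=mmap.ACCESS_READ,
            offset=aligned_start,
        ) as view:
            # Slicing copies the bytes out, so the result stays valid
            # after the view is closed.
            return view[skip:skip + length]
```

For fixed-size elements, the byte range for elements `i` through `i + n` of an array would be `start = i * element_size` and `length = n * element_size`. If the algorithm can work on the mapped view directly, it could keep the view open and wrap it in a `memoryview` instead of copying the bytes out.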
