
How to create a memory-mapped list of C++ objects

Original post: 2011-11-17 04:42:21 · Tags: c++, memory

It's been a while since I have done any C++, and I'm a little rusty on the best way to implement the following:

We have a database which stores large amounts of "objects". I am trying to think of a way to load an entire list of these objects into memory, but because of the size and number of these objects, it would be impractical to actually have them all in memory. Rather, I would like to have a "memory-mapped-file-like" system where the objects are loaded on demand when they are accessed. In other words, let the OS or something similar manage which objects should be in memory, similar to how the OS decides which segments of a file should be paged into memory. Can anyone give me a hint on how this could be done?

4 answers

If you're rusty on C++, you might take a simple approach.

You mention "objects"; I take this to mean "user data", not actual serialized C++ classes.

Anyhow, a memory-mapped file is just a file. You're still going to be reading from the file; the OS isn't going to solve your problems for you.

My advice is to keep it simple. Implement your "objects" with normal file I/O first. Then, once you have that working, you can bump up the performance by using a memory-mapped file instead.

As for design patterns, I would design a CObject class whose instances are created by a CDataBase class. The CDataBase would know where every object in the file (the database) is, and would create CObjects as needed by reading them from the file.
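The CDataBase/CObject split described above could be sketched roughly as follows, using plain file I/O as a first step. This is only an illustration under stated assumptions: the answer names the two classes but not their members, so `store`, `load`, the `Location` struct, and the idea that a record is an opaque byte string keyed by an integer id are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical in-memory form of one stored record.
class CObject {
public:
    CObject(std::uint64_t id, std::string payload)
        : id_(id), payload_(std::move(payload)) {}
    std::uint64_t id() const { return id_; }
    const std::string& payload() const { return payload_; }
private:
    std::uint64_t id_;
    std::string payload_;
};

// Owns the data file plus an index of where each record lives in it.
// Assumption: the file is created fresh (truncated) for this sketch.
class CDataBase {
public:
    explicit CDataBase(const std::string& path)
        : file_(path, std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc) {
        if (!file_) throw std::runtime_error("cannot open " + path);
    }

    // Append a record to the file and remember its offset and length.
    void store(std::uint64_t id, const std::string& payload) {
        file_.clear();
        file_.seekp(0, std::ios::end);
        const std::streamoff offset = file_.tellp();
        assert(offset >= 0);  // tellp reports -1 on failure
        file_.write(payload.data(), static_cast<std::streamsize>(payload.size()));
        file_.flush();
        index_[id] = Location{offset, payload.size()};
    }

    // Read one record from disk on demand and build a CObject from it.
    CObject load(std::uint64_t id) {
        const auto it = index_.find(id);
        if (it == index_.end()) throw std::out_of_range("unknown object id");
        std::string payload(it->second.length, '\0');
        file_.clear();
        file_.seekg(it->second.offset);
        file_.read(&payload[0], static_cast<std::streamsize>(payload.size()));
        if (!file_) throw std::runtime_error("short read");
        return CObject(id, std::move(payload));
    }

private:
    struct Location {
        std::streamoff offset;
        std::size_t length;
    };
    std::fstream file_;
    std::unordered_map<std::uint64_t, Location> index_;
};
```

Only the index stays in memory; each `load` costs one seek and one read. Swapping the `fstream` for a memory-mapped view later would change `load` but not the callers.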

Good luck.

Just a warning: if you have a large number of objects stored in SQL tables that you want to load into memory arbitrarily, it will likely be slow in multiple ways: many hits to the database (try to use a minimum number of queries), too many constructor calls (use memory pools), etc...

...but you need to take it a step at a time: first see if you can read a record from SQL into an instantiated object. The best speed optimization will be in how you organize your data to minimize hits to the DB and to minimize constructor calls.

Note that a memory-mapped solution would be in lieu of a SQL table. It will be faster, but less flexible than SQL, and you will have the trouble of double maintenance: data in SQL must be kept in sync with your memory-mapped file.

Take a look at Boost memory-mapped files.
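Boost's mapped-file classes wrap the operating system's own mapping facility. To show what such a mapping gives you without depending on Boost, here is a minimal sketch that calls POSIX `mmap` directly instead; it is not Boost's API. The `Record` layout, the `MappedRecords` class, and the `write_records` helper are hypothetical names made up for the illustration, and the sketch assumes fixed-size, trivially copyable records on a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical fixed-size record; it must be trivially copyable so that
// its bytes on disk are also its representation in memory.
struct Record {
    std::uint64_t id;
    char name[56];
};
static_assert(std::is_trivially_copyable<Record>::value, "Record must be plain data");

// Helper for the sketch: write records back-to-back so there is a file to map.
void write_records(const std::string& path, const std::vector<Record>& records) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(records.data()),
              static_cast<std::streamsize>(records.size() * sizeof(Record)));
}

// Read-only view of a file of Records, mapped with POSIX mmap.
class MappedRecords {
public:
    explicit MappedRecords(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);
        struct stat st {};
        if (::fstat(fd, &st) != 0 || st.st_size == 0) {
            ::close(fd);
            throw std::runtime_error("cannot map empty or unreadable file");
        }
        bytes_ = static_cast<std::size_t>(st.st_size);
        void* p = ::mmap(nullptr, bytes_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
        base_ = static_cast<const Record*>(p);
    }
    ~MappedRecords() { ::munmap(const_cast<Record*>(base_), bytes_); }
    MappedRecords(const MappedRecords&) = delete;
    MappedRecords& operator=(const MappedRecords&) = delete;

    std::size_t size() const { return bytes_ / sizeof(Record); }

    // No explicit read: the OS pages in the touched part of the file on access.
    const Record& operator[](std::size_t i) const {
        assert(i < size());
        return base_[i];
    }

private:
    const Record* base_ = nullptr;
    std::size_t bytes_ = 0;
};
```

Indexing a `MappedRecords` looks like indexing an array, and the OS decides which pages of the file stay resident. This is close to the behavior the question asks for, at the price of a fixed on-disk layout.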

So, depending on the size of your objects, you could store them in a hash map which keeps an LRU list and starts evicting objects to a file, while keeping the key and a file offset in the map. That way, when you do need to pull an object back from disk, it takes one seek and one read to get it back. If you then want to go on to preallocating files of a fixed size, you could mmap them, and your offset would become just another pointer.
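A minimal sketch of that hash-map-plus-LRU idea is shown below. It is an illustration under assumptions, not how any particular database implements it: the `SpillCache` name, string keys and values, and the append-only spill file (old copies are never reclaimed) are all choices made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <list>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Keeps at most `capacity` values in memory; the least recently used ones
// are appended to a spill file and only their offset and length are kept.
class SpillCache {
public:
    SpillCache(std::size_t capacity, const std::string& spill_path)
        : capacity_(capacity),
          file_(spill_path, std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc) {
        assert(capacity_ > 0);
        if (!file_) throw std::runtime_error("cannot open " + spill_path);
    }

    void put(const std::string& key, std::string value) {
        Entry& e = map_[key];
        if (e.in_memory) lru_.erase(e.lru_pos);
        e.value = std::move(value);
        mark_resident(key, e);
        evict_if_needed();
    }

    std::string get(const std::string& key) {
        const auto it = map_.find(key);
        if (it == map_.end()) throw std::out_of_range("unknown key: " + key);
        Entry& e = it->second;
        if (e.in_memory) {
            lru_.splice(lru_.begin(), lru_, e.lru_pos);  // now most recently used
        } else {
            // One seek and one read bring the object back from disk.
            e.value.assign(e.length, '\0');
            file_.clear();
            file_.seekg(e.offset);
            file_.read(&e.value[0], static_cast<std::streamsize>(e.length));
            if (!file_) throw std::runtime_error("short read from spill file");
            mark_resident(key, e);
            evict_if_needed();
        }
        return e.value;
    }

    std::size_t resident() const { return lru_.size(); }

private:
    struct Entry {
        bool in_memory = false;
        std::string value;                          // meaningful while in memory
        std::list<std::string>::iterator lru_pos;   // meaningful while in memory
        std::streamoff offset = 0;                  // meaningful once evicted
        std::size_t length = 0;                     // meaningful once evicted
    };

    void mark_resident(const std::string& key, Entry& e) {
        lru_.push_front(key);
        e.lru_pos = lru_.begin();
        e.in_memory = true;
    }

    void evict_if_needed() {
        while (lru_.size() > capacity_) {
            Entry& victim = map_[lru_.back()];
            file_.clear();
            file_.seekp(0, std::ios::end);
            victim.offset = file_.tellp();
            victim.length = victim.value.size();
            file_.write(victim.value.data(), static_cast<std::streamsize>(victim.length));
            file_.flush();
            victim.value.clear();
            victim.value.shrink_to_fit();  // actually release the memory
            victim.in_memory = false;
            lru_.pop_back();
        }
    }

    std::size_t capacity_;
    std::fstream file_;
    std::list<std::string> lru_;  // front = most recently used key
    std::unordered_map<std::string, Entry> map_;
};
```

Every key stays in the map even after its value is evicted, which is why the approach pays off mainly when values are much larger than keys.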

This is a simplification of how the Riak database's initial storage worked; they have some design docs on it on their website [1][2]. It only works really well if your objects are larger than the keys, so that all the keys fit in memory easily but the objects don't.

The Cassandra database uses a similar technique with its "key cache" [3].

You could also look at something like Berkeley DB for your local store.

[1]: http://wiki.basho.com/Concepts.html#Data-Storage
[2]: http://downloads.basho.com/papers/bitcask-intro.pdf
[3]: http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra