In-memory databases with LMDB

I have a project which uses BerkeleyDB as a key-value store for up to hundreds of millions of small records.

All the values are first inserted into the database, and then they are read back using both sequential and random access, all from a single thread.

With BerkeleyDB, I can create in-memory databases that are "never intended to be preserved on disk". If the database is small enough to fit in the BerkeleyDB cache, it will never be written to disk. If it is bigger than the cache, a temporary file is created to hold the overflow. This option can speed things up significantly, as it prevents my application from writing gigabytes of dead data to disk when closing the database.

I have found that BerkeleyDB's write performance is too poor, even on an SSD, so I would like to switch to LMDB. However, based on the documentation, there doesn't seem to be an option for creating a non-persistent database.

What configuration or combination of options should I use to get the best performance out of LMDB if I don't care about persistence or concurrent access at all? That is, how do I make it act like an "in-memory database" with temporary backing disk storage?

Just use MDB_NOSYNC and never call mdb_env_sync() yourself. You could also use MDB_WRITEMAP in addition. The OS will still eventually flush dirty pages to disk; you can play with /proc/sys/vm/dirty_ratio etc. to control that behavior.

From this post: https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk.

If the dirty ratio is too small, then you will see frequent synchronous disk writes.
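Before changing anything, it can help to see what the current thresholds are. The following is a small Linux-only sketch that reads them from procfs; read_vm_setting and print_dirty_settings are names made up for this example, and dirty_background_ratio is included on the assumption that it is one of the settings the "etc." above refers to.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a single integer from a procfs-style file.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed. */
int read_vm_setting(const char *path, long *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print the current writeback thresholds, as percentages of memory. */
void print_dirty_settings(void)
{
    static const char *paths[] = {
        "/proc/sys/vm/dirty_ratio",
        "/proc/sys/vm/dirty_background_ratio",
    };
    for (size_t i = 0; i < sizeof paths / sizeof paths[0]; i++) {
        long v;
        if (read_vm_setting(paths[i], &v) == 0)
            printf("%s = %ld\n", paths[i], v);
        else
            printf("%s: not available\n", paths[i]);
    }
}
```

Calling print_dirty_settings() at startup shows the values the kernel is currently using. Changing them requires root, either by writing to the same files or with sysctl -w, and the change applies system-wide, not just to the LMDB process.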
