
Performance difference of Boost r-tree in memory vs. mapped file

I need to create a 3D R*-tree, perhaps for long-term storage, but performance will also be an issue. In order to create the tree, I decided to use Boost's spatial index and basically found two possible methods.

Either I create it directly using objects, as done here: Index of polygons stored in vector. However, that does not allow me to store and load it without creating the R*-tree again.

Or I could use a mapped file as explained here: Index stored in mapped file using Boost.Interprocess. However, I am not sure whether query performance is good enough in this case.

My r-tree will contain several thousand entries, but most likely fewer than about 100,000. Now my question is: is there any significant performance penalty in using mapped files compared to using standard objects? Also, if creating an R*-tree of about 100,000 values does not take a substantial amount of time (I could have all bounding boxes and corresponding keys/data stored in a file), might it be a better option to skip the mapped file and just create the tree every time I run the program?

Hopefully, somebody can help me here, as the documentation does not really provide much information (though it's still worlds better than the documentation of libspatialindex).

A mapped file will behave mostly like regular memory (in fact, on Linux, memory allocation with new or malloc will use mmap [with "no file" backing storage] as the underlying allocation method). However, if you do many small writes "all over the place", and you are mapping over a real file, then the OS will restrict the amount of buffered writes before flushing to the file.

I did some experiments when this subject came up a while ago, and by adjusting the settings for how the OS deals with these "pending writes", I got reasonable performance even for file-backed memory mapping with a random read/write pattern [something I expect happens when you are building your tree].

Here's the "performance of mmap with random writes" question, which I think is highly related: Bad Linux Memory Mapped File Performance with Random Access C++ & Python. (This answer applies to Linux; other OSes, in particular Windows, may well behave completely differently with regard to how they deal with writes to mapped files.)

Of course, it's pretty hard to say "which is better" between a memory-mapped file and rebuilding every time the program is run; it really depends on what your application does, whether you run it 100 times a second or once a day, how long the rebuild takes [I have absolutely no idea!], and lots of other such things. There are two choices: build the simplest version and see if it's "fast enough", or build both versions, measure how much difference there is, and then decide which path to go down.

I tend to build the simple(ish) model first, and if performance isn't good enough, figure out where the slowness comes from and then fix that. This saves spending lots of time making something that takes 0.01% of the total execution time run 5 clock cycles faster, only to end up with a big thinko somewhere else that makes it run 500 times slower than you expected...

Bulk-loading the index is much faster than repeated insertion, and yields a much more efficient tree. So if you can hold all your data in main memory, I suggest rebuilding the tree using STR bulk loading. In my experience this is more than fast enough (bulk loading time is dwarfed by I/O time).

The cost of STR is roughly that of sorting: O(n log n) in theory, with very low constants (a less efficient implementation may be O(n log n log n), but that is still fairly cheap).
