
Choice of parameters for R* Tree using SpatialIndex library

I am using the spatialindex library from http://libspatialindex.github.com/

I am creating an R* tree in main memory:

size_t capacity = 10;
bool bWriteThrough = false;
fileInMem = StorageManager
    ::createNewRandomEvictionsBuffer(*memStorage, capacity, bWriteThrough);

double fillFactor = 0.7;
size_t indexCapacity = 10;
size_t leafCapacity = 10;
size_t dimension = 2;
RTree::RTreeVariant rv = RTree::RV_RSTAR;
tree = RTree::createNewRTree(*fileInMem, fillFactor, indexCapacity,
   leafCapacity, dimension, rv, indexIdentifier);

Then I insert a large number of bounding boxes, currently some 2.5M (the road network of Bavaria in Germany). Later I aim to insert all the roads of Europe.

What are good choices of parameters for the storage manager and the R-tree? I mostly use the R-tree to find the roads closest to a given query (bbox intersection).

As your data is static, a good bulk load may work for you. The most popular (and rather simple) bulk-loading method is Sort-Tile-Recursive (STR). However, it is designed largely around point data. As you are inserting spatial objects with extent, it may or may not work as well.
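To make the idea concrete, here is a minimal, library-independent sketch of one STR pass in 2D that packs boxes into leaves. The `Box` type and the `strPackLeaves` function are hypothetical names made up for this illustration, not part of libspatialindex. It sorts by box centers, which is the usual way to apply STR to objects with extent. As far as I know the library ships its own bulk loader (`RTree::createAndBulkLoadNewRTree` with an STR method), so check its headers before rolling your own.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Axis-aligned 2D bounding box (hypothetical type for this sketch).
struct Box {
    double minX, minY, maxX, maxY;
    double centerX() const { return 0.5 * (minX + maxX); }
    double centerY() const { return 0.5 * (minY + maxY); }
};

// One STR pass: groups the boxes into leaves of at most leafCapacity entries.
// Upper tree levels would be built by repeating the same pass on the leaf MBRs.
std::vector<std::vector<Box>> strPackLeaves(std::vector<Box> boxes,
                                            std::size_t leafCapacity) {
    assert(leafCapacity > 0);
    using Diff = std::ptrdiff_t;
    std::vector<std::vector<Box>> leaves;
    if (boxes.empty()) return leaves;

    const std::size_t n = boxes.size();
    // Number of leaves needed, and number of vertical slices (about sqrt of that).
    const std::size_t leafCount = (n + leafCapacity - 1) / leafCapacity;
    const std::size_t sliceCount = static_cast<std::size_t>(
        std::ceil(std::sqrt(static_cast<double>(leafCount))));
    const std::size_t sliceSize = sliceCount * leafCapacity;

    // Step 1: sort all boxes by the x coordinate of their center.
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.centerX() < b.centerX(); });

    for (std::size_t start = 0; start < n; start += sliceSize) {
        const std::size_t end = std::min(n, start + sliceSize);
        // Step 2: within each vertical slice, sort by the y coordinate of the center.
        std::sort(boxes.begin() + static_cast<Diff>(start),
                  boxes.begin() + static_cast<Diff>(end),
                  [](const Box& a, const Box& b) { return a.centerY() < b.centerY(); });
        // Step 3: cut the slice into consecutive runs of leafCapacity boxes.
        for (std::size_t i = start; i < end; i += leafCapacity) {
            const std::size_t leafEnd = std::min(end, i + leafCapacity);
            leaves.emplace_back(boxes.begin() + static_cast<Diff>(i),
                                boxes.begin() + static_cast<Diff>(leafEnd));
        }
    }
    return leaves;
}
```

Long road segments have centers that say little about their extent, so the resulting leaves can overlap much more than they would for point data. That is the reason to measure query performance on your own data rather than assume it carries over.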

If you use a bulk load, it will no longer be an R*-tree, but a plain R-tree. The R* variant only changes how subtrees are chosen and nodes are split during insertion, and a bulk load bypasses those steps.

A capacity of 10 sounds far too small to me. You want a much larger fan-out. But you'll need to benchmark: what works well depends on the data set and the queries. I'd definitely try 100 or more.
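For a rough feel of what the fan-out does to tree height, here is a back-of-the-envelope sketch. It assumes every node holds about capacity × fillFactor entries, which is a simplification and not how the library computes anything. `estimateLevels` is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Estimates the number of tree levels (leaf level included), assuming each
// node holds roughly capacity * fillFactor entries. Illustrative only.
std::size_t estimateLevels(std::size_t entryCount, std::size_t capacity,
                           double fillFactor) {
    assert(capacity >= 2 && fillFactor > 0.0 && fillFactor <= 1.0);
    // Effective fan-out, rounded to the nearest integer.
    const std::size_t fanOut =
        static_cast<std::size_t>(static_cast<double>(capacity) * fillFactor + 0.5);
    assert(fanOut >= 2);

    std::size_t levels = 1;                                   // the leaf level
    std::size_t nodes = (entryCount + fanOut - 1) / fanOut;   // number of leaves
    while (nodes > 1) {
        nodes = (nodes + fanOut - 1) / fanOut;                // parents of this level
        ++levels;
    }
    return levels;
}
```

Under that assumption, `estimateLevels(2500000, 10, 0.7)` gives 8 levels and `estimateLevels(2500000, 100, 0.7)` gives 4. Fewer levels means fewer nodes touched per query, but each node is more expensive to scan, which is why the right value has to come from a benchmark.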
