Why is sorting in ArangoDB slow?

I am experimenting to see whether ArangoDB might be suitable for our use case. We will have large collections of documents with the same schema (like an SQL table).

To try some queries I have inserted about 90K documents, which is low, as we expect document counts on the order of 1 million or more.

Now I want to get a simple page of these documents, without filtering, but sorted in descending order.

So my AQL is:

for a in test_collection
sort a.ARTICLE_INTERNALNR desc
limit 0,10
return {'nr': a.ARTICLE_INTERNALNR}

When I run this in the AQL Editor, it takes about 7 seconds, while I would expect a couple of milliseconds or something like that.

I have tried creating a hash index and a skiplist index on it, but that didn't have any effect:

 db.test_collection.getIndexes()
[ 
  { 
    "id" : "test_collection/0", 
    "type" : "primary", 
    "unique" : true, 
    "fields" : [ 
      "_id" 
    ] 
  }, 
  { 
    "id" : "test_collection/19812564965", 
    "type" : "hash", 
    "unique" : true, 
    "fields" : [ 
      "ARTICLE_INTERNALNR" 
    ] 
  }, 
  { 
    "id" : "test_collection/19826720741", 
    "type" : "skiplist", 
    "unique" : false, 
    "fields" : [ 
      "ARTICLE_INTERNALNR" 
    ] 
  } 
]

So, am I missing something, or is ArangoDB not suitable for these cases?

If ArangoDB needs to sort all the documents, this will be a relatively slow operation (compared to not sorting). So the goal is to avoid the sorting altogether. ArangoDB has a skiplist index, which keeps indexed values in sorted order, and if that can be used in a query, it will speed up the query.
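To illustrate why a sorted index helps, here is a minimal Python sketch. It is only a toy model, not ArangoDB's skiplist implementation: the `SortedIndex` class, the helper names, and the sample data are all hypothetical. It contrasts sorting every value at query time with reading a page in forward order from a structure that is kept sorted on insert.

```python
import bisect
import random


def page_via_full_sort(values, offset, count):
    # No usable index: sort every value (O(n log n)), then cut out the page.
    return sorted(values)[offset:offset + count]


class SortedIndex:
    """Toy stand-in for a sorted index; NOT ArangoDB's skiplist."""

    def __init__(self):
        self._keys = []

    def insert(self, key):
        # The ordering cost is paid at write time, one key at a time.
        bisect.insort(self._keys, key)

    def page(self, offset, count):
        # Forward-order read: the keys are already sorted, so this is a slice.
        return self._keys[offset:offset + count]


# Hypothetical sample data standing in for ARTICLE_INTERNALNR values.
random.seed(0)
article_numbers = random.sample(range(1, 1_000_000), 10_000)

index = SortedIndex()
for nr in article_numbers:
    index.insert(nr)

first_page = index.page(0, 10)
```

Both approaches return the same ascending page, but the indexed read does no sorting at query time, which is the effect a usable skiplist index is meant to give.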

There are a few gotchas at the moment:

  1. AQL queries without a FILTER condition won't use an index.
  2. The skiplist index is fine for forward-order traversals, but it has no backward-order traversal facility.

Both these issues seem to have affected you. We hope to fix both issues as soon as possible.

At the moment there is a workaround that forces the index to be used in forward order, using an AQL query as follows:

FOR a IN 
  SKIPLIST(test_collection, { ARTICLE_INTERNALNR: [ [ '>', 0 ] ] }, 0, 10) 
RETURN { nr: a.ARTICLE_INTERNALNR }

The above picks up the first 10 documents via the index on ARTICLE_INTERNALNR with the condition "value > 0". I am not sure if there is a solution for sorting backwards with a limit.
