ArangoDB crashes on attempt to run a traversal AQL query with a COLLECT clause in it

The data model is:

books - documents

pages - documents. A page may contain only one reference to another book.

books_pages - edges. From book to page and from page to book.

Example:

book1 -> (edge) -> page1 -> (edge) -> book2
book1 -> (edge) -> page2 -> (edge) -> book2
book1 -> (edge) -> page3 -> (edge) -> book2
book1 -> (edge) -> page4 -> (edge) -> book3
book2 -> (edge) -> page5 -> (edge) -> book4
book2 -> (edge) -> page6 -> (edge) -> book4
book2 -> (edge) -> page7 -> (edge) -> book4
book2 -> (edge) -> page6 -> (edge) -> book4
...

The goal is to build edges between books while avoiding duplication. book1 contains several pages that mention book2, but I need only one edge. It doesn't matter how many times book2 is referenced in book1.

AQL:
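To make the expected result concrete, here is a minimal Python sketch (not AQL, and not part of the original question) that walks the example edges two hops and keeps each book pair only once. The `edges` list and the helper `book_to_book_edges` are hypothetical names introduced for illustration, and telling books from pages by the `book` name prefix is an assumption that only holds for this toy data.

```python
# Edges from the example above, as (source, target) pairs.
# The repeated page6 line is kept on purpose: duplicates should not matter.
edges = [
    ("book1", "page1"), ("page1", "book2"),
    ("book1", "page2"), ("page2", "book2"),
    ("book1", "page3"), ("page3", "book2"),
    ("book1", "page4"), ("page4", "book3"),
    ("book2", "page5"), ("page5", "book4"),
    ("book2", "page6"), ("page6", "book4"),
    ("book2", "page7"), ("page7", "book4"),
    ("book2", "page6"), ("page6", "book4"),
]


def book_to_book_edges(edges):
    """Return sorted, de-duplicated (from_book, to_book) pairs two hops apart."""
    # Build an adjacency map: vertex -> set of outbound neighbours.
    outbound = {}
    for src, dst in edges:
        outbound.setdefault(src, set()).add(dst)

    pairs = set()
    for book, pages in outbound.items():
        # Assumption: book vertices are recognisable by their name prefix.
        if not book.startswith("book"):
            continue
        for page in pages:
            for target in outbound.get(page, ()):
                pairs.add((book, target))
    return sorted(pairs)


print(book_to_book_edges(edges))
# [('book1', 'book2'), ('book1', 'book3'), ('book2', 'book4')]
```

The set of pairs plays the role that COLLECT or RETURN DISTINCT plays in the queries: the three pages linking book1 to book2 collapse into a single edge.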

FOR b1 IN books
    FOR v IN 1..1 OUTBOUND b1 books_pages
        FOR b2 IN 1..1 OUTBOUND v books_pages
            COLLECT from = b1._id, to = b2._id
            RETURN {'from': from, 'to': to}

When the number of documents in the database is significant, ArangoDB crashes. Is something wrong with this query, or is this just a bug on the ArangoDB side?

I cannot comment on the crash, not least because you don't give any information about it or how it manifests itself. If the cause is an out-of-memory kill/restart, you should mention that (check the system logs if the ArangoDB log is not helpful).

But concerning your problem: aren't you interested in all unique paths of length 3 (in terms of vertices; 2 in terms of edges)? Doesn't that condense to:

FOR b IN books
   FOR v,e,p IN 2..2 OUTBOUND b GRAPH 'books'
      RETURN DISTINCT {"from": p.vertices[0]._id, "to": p.vertices[2]._id}

It works for a very small sample set. Maybe this is a bit lighter on the query planner and executor?

Adding options to the AQL query helped solve the problem:

OPTIONS {uniqueEdges: 'path', uniqueVertices: 'global', bfs: true}
