ArangoDB crashes when trying to run a traversal AQL query with a COLLECT clause

Question

Data model:

books - documents

pages - documents. A page may contain only one reference to another book.

books_pages - edges, from book to page and from page to book.

Example:

book1 -> (edge) -> page1 -> (edge) -> book2
book1 -> (edge) -> page2 -> (edge) -> book2
book1 -> (edge) -> page3 -> (edge) -> book2
book1 -> (edge) -> page4 -> (edge) -> book3
book2 -> (edge) -> page5 -> (edge) -> book4
book2 -> (edge) -> page6 -> (edge) -> book4
book2 -> (edge) -> page7 -> (edge) -> book4
book2 -> (edge) -> page6 -> (edge) -> book4
...

The goal is to build edges between books while avoiding duplicates. book1 contains several pages that mention book2, but I need only one edge. It doesn't matter how many times book2 is referenced in book1.

AQL:

FOR b1 IN books
    FOR v IN 1..1 OUTBOUND b1 books_pages
        FOR b2 IN 1..1 OUTBOUND v books_pages
            COLLECT from = b1._id, to = b2._id
            RETURN {'from': from, 'to': to}

When the number of documents in the database is significant, ArangoDB crashes. Is something wrong with this query, or is this just a bug on the ArangoDB side?

Answer 1

I cannot comment on the crash, not least because you don't give any information about it or how it manifests itself. If the cause is an out-of-memory kill/restart, you should mention that (check the system logs if the ArangoDB log is not helpful).

But concerning your problem: aren't you interested in all unique paths of length 3 (in terms of vertices; 2 in terms of edges)? Doesn't that condense to:

FOR b IN books
   FOR v,e,p IN 2..2 OUTBOUND b GRAPH 'books'
      RETURN DISTINCT {"from": p.vertices[0]._id, "to": p.vertices[2]._id}

It works for a very small sample set. Maybe this is a bit lighter on the query planner and executor?

Answer 2

Adding options to the AQL query helped to solve the problem.

OPTIONS {uniqueEdges: 'path', uniqueVertices: 'global', bfs: true}
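
The answer does not show where the OPTIONS clause goes or which query it was added to. In AQL, traversal options follow the edge collection (or graph name) of the traversal FOR statement. A minimal sketch, assuming the options are applied to a single depth-2 traversal over the books_pages edge collection from the question (this placement is an assumption, not something the answer confirms):

```aql
FOR b1 IN books
    // Assumption: one traversal of exactly two edges (book -> page -> book)
    // replaces the two nested 1..1 traversals from the question.
    FOR b2 IN 2..2 OUTBOUND b1 books_pages
        OPTIONS {uniqueEdges: 'path', uniqueVertices: 'global', bfs: true}
        RETURN {'from': b1._id, 'to': b2._id}
```

With uniqueVertices: 'global', which requires breadth-first traversal, each vertex is visited at most once per start vertex, so each target book should appear only once per source book without a COLLECT or DISTINCT step. Newer ArangoDB releases spell the breadth-first setting as order: 'bfs', so check the documentation for the version in use.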
