
ArangoDB: Aggregating counts via graph traversal

In my ArangoDB graph, I have a subject, message threads associated with that subject, and messages inside those message threads. I would like to traverse the graph so that I get back the data associated with each message thread together with the count of messages inside it.

The data is structured fairly simply: I have the subject node, an edge extending to the thread node with the date and category associated, and an edge from the thread node to the message node.

I would like to return the data stored in the thread node and the count of messages attached to the thread.

I'm not sure how to do this with the for v, e, p in 1..2 outbound syntax. Should I just do for v, e, p in outbound with a nested traversal inside it? Is that still performant?

Sorry for the delay, we are working hard on the 3.1 release ;)

I think you have already found the right solution: what you want to achieve is not easy to express in a single 1..2 OUTBOUND statement. It is much easier to formulate with two 1..1 OUTBOUND statements.
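To illustrate why the single 1..2 OUTBOUND form is awkward, here is a rough sketch of what it might look like. It is not from the question; it assumes both edge collections are handed to one traversal and that the thread is always the second vertex on the path (p.vertices[1]).

```aql
// Sketch only: one traversal over both edge collections (assumed bind parameters).
FOR v, e, p IN 1..2 OUTBOUND @start @@threadEdges, @@messageEdges
  // Keep only full subject -> thread -> message paths.
  FILTER LENGTH(p.edges) == 2
  // Group the paths by their thread vertex and count them.
  COLLECT thread = p.vertices[1] WITH COUNT INTO nr
  RETURN {
    date: thread.date,
    category: thread.category,
    messages: nr
  }
```

Because of the filter on path length, a thread without any messages produces no path of depth 2 and would be missing from the result entirely, rather than showing up with a count of 0.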

From your explanation, I think the following query is what you would use:

FOR thread IN 1 OUTBOUND @start @@threadEdges
  LET nr = COUNT(FOR message IN 1 OUTBOUND thread @@messageEdges RETURN 1)
  RETURN {
    date: thread.date,
    category: thread.category,
    messages: nr
  }

Some explanation: I first select the associated threads. Next, I run a subquery that simply counts the messages for one thread. Finally, I return the information I need.
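For reference, this is how the query above might look with the bind parameters filled in. The start vertex subjects/42 and the edge collection names hasThread and hasMessage are hypothetical placeholders, not names from the question. LENGTH is used here in place of COUNT; in AQL, COUNT is an alias for LENGTH.

```aql
// Hypothetical names: subjects/42, hasThread, hasMessage.
FOR thread IN 1 OUTBOUND "subjects/42" hasThread
  // Subquery: one row per message attached to this thread.
  LET nr = LENGTH(
    FOR message IN 1 OUTBOUND thread hasMessage
      RETURN 1
  )
  RETURN {
    date: thread.date,
    category: thread.category,
    messages: nr
  }
```

A thread with no outgoing message edges yields an empty subquery result, so it is still returned, with messages set to 0.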

In terms of performance: for data access (which is most likely the "bottleneck" operation) there is no difference between FOR x IN 1..2 OUTBOUND [...] and FOR x IN 1 OUTBOUND [...] FOR y IN 1 OUTBOUND x [...]; both have to look at exactly the same documents. Query optimization might be a bit slower in the latter case, but the difference is way below 1ms.
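Spelled out, the two shapes being compared look roughly like this. The elided [...] parts are filled in with assumed bind parameters for illustration only.

```aql
// Shape 1: a single traversal of depth 1 to 2 over both edge collections.
FOR x IN 1..2 OUTBOUND @start @@threadEdges, @@messageEdges
  RETURN x

// Shape 2: two nested depth-1 traversals.
FOR x IN 1 OUTBOUND @start @@threadEdges
  FOR y IN 1 OUTBOUND x @@messageEdges
    RETURN y
```

These are two separate queries. Note that they do not return the same rows: the first emits both threads and messages, while the second emits only messages. The point above concerns which documents have to be read, not the shape of the result.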
