Filtering Arango query results to report an array intersection in AQL

I am experimenting with ArangoDB as a replacement for Postgres, which I am currently using. In Postgres I have a table, zuffs, containing rows of the following form:

 hash       passes     visits   

 123        {1,2,4}   {2,3,4,5}

where both passes and visits are int[]. To compute the intersection of passes and visits I would write

SELECT ARRAY(
    SELECT UNNEST(a1)
    INTERSECT
    SELECT UNNEST(a2)
)
FROM (SELECT passes AS a1, visits AS a2 FROM zuffs WHERE hash = 123) q;

which Postgres obligingly executes to return the result {2,4}.

Now suppose I have the collection zuffs in ArangoDB with the following documents:

{"hash":45,"passes":[1,2,3],"visits":[3,11,17]}

{"hash":76,"passes":[11,2],"visits":[3,4,17]}

{"hash":13,"passes":[11,21],"visits":[13,44,27]}

{"hash":7,"passes":[2],"visits":[4,67]}

It is not clear to me how I would do the following:

  1. Establish the intersection, I1, of passes and visits for the document bearing the hash 45.
  2. Take that result and return the hashes of other documents in the same collection whose own intersection of passes and visits has a non-empty intersection with I1.

While there is much I like about ArangoDB, I find it unfortunate that it has its own query language instead of just using the required superset of SQL. In this instance I have figured out that I will somehow have to use FOR ... IN along with FILTER, but it is not at all clear to me how.

Being a SQL expert, I also fought against the syntax differences of AQL. However, it really did not end up being that hard to understand, which made "learning" just a function of time and use.

I'm sure there are other ways to do this, but here's a quick and dirty example:

  1. Find the document you want to match, and get its intersection
  2. For each document z in zuffs, calculate the intersection and look for matches
  3. Return the hash of matching documents

// Step 1: collect the values shared by passes and visits in document 45
LET match = (
    FOR z IN zuffs
        FILTER z.hash == 45
        FOR i IN INTERSECTION(z.passes, z.visits)
            RETURN i
)
// Steps 2 and 3: return the hash of every document whose own intersection contains one of those values
FOR z IN zuffs
    LET i = INTERSECTION(z.passes, z.visits)
    FOR m IN match
        FILTER m IN i
        RETURN z.hash

Returns:

[
  45
]

Given your example data set, you will get back only one document meeting the requirements (45). Adding more documents, or modifying one of the other documents so that it shares a value with that intersection, will provide more interesting results.
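To see what "more interesting results" might look like without touching the database, here is a rough plain-Python model of the first query. It is only a sketch of the loop semantics, not AQL: the intersection helper stands in for AQL's INTERSECTION (whose result order is not guaranteed), and the document with hash 99 is a hypothetical addition that is not part of the question's data.

```python
# Plain-Python model of the first AQL query (illustration only, not AQL).
zuffs = [
    {"hash": 45, "passes": [1, 2, 3], "visits": [3, 11, 17]},
    {"hash": 76, "passes": [11, 2], "visits": [3, 4, 17]},
    {"hash": 13, "passes": [11, 21], "visits": [13, 44, 27]},
    {"hash": 7, "passes": [2], "visits": [4, 67]},
]


def intersection(a, b):
    """Unique values present in both lists, in the order of the first list."""
    in_b = set(b)
    out = []
    for x in a:
        if x in in_b and x not in out:
            out.append(x)
    return out


def matching_hashes(docs, target_hash):
    # LET match = (FOR z IN zuffs FILTER z.hash == target FOR i IN INTERSECTION(...) RETURN i)
    match = [
        i
        for z in docs
        if z["hash"] == target_hash
        for i in intersection(z["passes"], z["visits"])
    ]
    # FOR z IN zuffs LET i = INTERSECTION(...) FOR m IN match FILTER m IN i RETURN z.hash
    return [
        z["hash"]
        for z in docs
        for m in match
        if m in intersection(z["passes"], z["visits"])
    ]


original = matching_hashes(zuffs, 45)

# Hypothetical extra document whose passes and visits also share the value 3.
extra = {"hash": 99, "passes": [3, 8], "visits": [3, 9]}
extended = matching_hashes(zuffs + [extra], 45)

print(original)  # [45]
print(extended)  # [45, 99]
```

With the original four documents only 45 qualifies; once another document's passes and visits also share the value 3, its hash shows up as well.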

Things to keep in mind:

  1. Moving from SQL to AQL makes you shift your thinking from "sets" (of rows) to "loops" (over things).
  2. Keep track of the return (or native) types (array vs. object/string/etc.), and treat each piece of data accordingly (note the FOR i IN INTERSECTION... in the match subquery).
  3. Use the Explain and Profile buttons (or functions) to see how your query is performing. Think of other ways the result can be accomplished, and try to make it faster!
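The point about return types can be illustrated with a small plain-Python analogy (a sketch, not AQL). A subquery always yields an array of result rows, so returning the whole intersection gives an array wrapped in an array, while looping over the intersection and returning each element gives a flat array. The variable names below are made up for the illustration.

```python
# Plain-Python analogy for AQL subquery result shapes (illustration only).
doc = {"hash": 45, "passes": [1, 2, 3], "visits": [3, 11, 17]}

# Stand-in for INTERSECTION(z.passes, z.visits)
common = [p for p in doc["passes"] if p in doc["visits"]]

# Like (FOR z ... RETURN INTERSECTION(...)): one row, and that row is itself an array.
rows_returning_array = [common for _ in [doc]]

# Like (FOR z ... FOR i IN INTERSECTION(...) RETURN i): one row per shared value.
rows_returning_items = [i for _ in [doc] for i in common]

# Like FIRST(...): unwrap the single row, or null/None when there are no rows.
first_row = rows_returning_array[0] if rows_returning_array else None

print(rows_returning_array)  # [[3]]
print(rows_returning_items)  # [3]
print(first_row)             # [3]
```

Either shape works as long as the rest of the query treats it consistently: a flat array can be looped over directly, while the nested one needs unwrapping first.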

For instance:

LET match = FIRST(
    FOR z IN zuffs
        FILTER z.hash == 45
        RETURN INTERSECTION(z.passes, z.visits)
)
FOR z IN zuffs
    FILTER LENGTH(INTERSECTION(match, INTERSECTION(z.passes, z.visits))) > 0
    RETURN z.hash

The match sections of both examples return the same result (an array with a single value of 3), but they do it in different ways. And instead of looping with FOR m IN match..., the second example uses native array functions inside a FILTER. In reality the first example is much quicker than the second, and the reason is evident in the "explain" plan.
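One behavioural difference between the two forms is worth checking when the target intersection holds more than one value: the loop form emits a hash once per matching value, while the FILTER form emits each hash at most once. The sketch below models both in plain Python (not AQL) on hypothetical documents chosen so that the target intersection is [2, 3]; none of this data comes from the question.

```python
# Plain-Python comparison of the two query shapes (illustration only, not AQL).
# Hypothetical documents, chosen so the target intersection has two values.
docs = [
    {"hash": 1, "passes": [2, 3, 5], "visits": [2, 3, 9]},
    {"hash": 2, "passes": [2, 3], "visits": [3, 2, 7]},
    {"hash": 3, "passes": [5], "visits": [5]},
]


def intersection(a, b):
    """Unique values present in both lists, in the order of the first list."""
    in_b = set(b)
    out = []
    for x in a:
        if x in in_b and x not in out:
            out.append(x)
    return out


target = next(d for d in docs if d["hash"] == 1)
match = intersection(target["passes"], target["visits"])  # [2, 3]

# First form: FOR m IN match FILTER m IN i RETURN z.hash
loop_form = [
    d["hash"]
    for d in docs
    for m in match
    if m in intersection(d["passes"], d["visits"])
]

# Second form: FILTER LENGTH(INTERSECTION(match, INTERSECTION(...))) > 0 RETURN z.hash
filter_form = [
    d["hash"]
    for d in docs
    if len(intersection(match, intersection(d["passes"], d["visits"]))) > 0
]

print(loop_form)    # [1, 1, 2, 2]
print(filter_form)  # [1, 2]
```

If the loop form is preferred for speed, using RETURN DISTINCT z.hash in the AQL is one way to drop the repeated hashes.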

I found it immensely helpful to become familiar with the high-level and function documentation. Those two places will have almost everything you need to succeed with AQL (aside from "big picture" stuff like query tuning, indexing, etc.).
