
Migrating a filterVertices UDF from ArangoDB 2.8 to ArangoDB 3

I am currently migrating a TRAVERSAL function from ArangoDB 2 to ArangoDB 3. The AQL query had a custom leaf visitor and a filterVertices option with a custom AQL function (for more specific filtering).

FOR result IN TRAVERSAL(
    page, 
    menu, 
    "page/99999999999999",
    "inbound",
    {filterVertices: "udf::customFilter", visitor: "udf::customVisitor"}
) RETURN result

The leaf visitor UDF was relatively easy to transfer since it just creates a custom object, but I am having trouble with the filterVertices UDF because the graph functions have been removed in ArangoDB 3.

The filterVertices UDF contains a few cases like the one below:

    //check the page status
    if (mismatch == 1) {
        //stop traversal and not return mismatched
        return ['exclude', 'prune'];
    } else if (mismatch == 2) {
        //stop but return mismatched
        return 'prune';
    } else {
        //exclude mismatched but continue
        return 'exclude';
    }

My question is: how exactly should prune and exclude be translated into FILTER statements in the AQL below?

FOR v, d, p IN 1..10 INBOUND "page/99999999999999" menu 
    LET filtered = CALL('udf::customFilter',v,p) 
    LET result = CALL('udf::customVisitor',v,d,p) 
RETURN {filtered:filtered,result:result}

Will performance be affected if I use the UDF as is, store its result in a LET variable, and exclude (filter) the results manually?

Generally speaking, you get "prune" together with "exclude" when you write the filter on the path object (in your case p). Here the optimizer will recognize that any longer path cannot fulfill a certain condition. Examples:

FILTER p.edges[1].type == 'FOO'
FILTER p.edges[*].label ALL == 'BAR'
FILTER p.vertices[*].age ALL >= 18

The first will prune whenever the second edge does not have type FOO. The second will prune as soon as it finds a label != BAR, and so on. The optimizer can only recognize checks at a specific depth or the global checks ALL, NONE, and ANY.
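To see why a path-wide condition lets the traversal stop early, here is a small plain-JavaScript model. It is not ArangoDB code: the graph, the `traverse` helper, and the label values are made up for illustration. It mimics what `FILTER p.edges[*].label ALL == 'BAR'` permits: once one edge on the path fails the check, no longer path can pass, so the branch is abandoned.

```javascript
// Hypothetical in-memory graph: vertex -> list of outgoing edges.
const graph = {
  a: [{ to: "b", label: "BAR" }, { to: "c", label: "QUX" }],
  b: [{ to: "d", label: "BAR" }],
  c: [{ to: "e", label: "BAR" }],
  d: [],
  e: [],
};

// Depth-first traversal that returns every path whose edges all satisfy
// edgeOk. A failing edge is neither returned nor followed (prune + exclude).
function traverse(start, maxDepth, edgeOk) {
  const results = [];
  let edgesInspected = 0;
  function visit(vertex, path, depth) {
    if (depth === maxDepth) return;
    for (const edge of graph[vertex]) {
      edgesInspected++;
      if (!edgeOk(edge)) continue; // longer paths cannot recover, so stop here
      const next = [...path, edge.to];
      results.push(next);
      visit(edge.to, next, depth + 1);
    }
  }
  visit(start, [start], 0);
  return { results, edgesInspected };
}

const pruned = traverse("a", 10, (edge) => edge.label === "BAR");
console.log(pruned.results);        // paths a->b and a->b->d
console.log(pruned.edgesInspected); // the edge c->e is never looked at
```

In this model the branch below "c" is skipped entirely, which is the saving a recognizable path filter is meant to give.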

You get "exclude" only if you define the filter on the vertex or edge output, in your case v and d:

FILTER d.type != "BAR"
FILTER v.name == "BAZ"

The first will exclude all edges that have type "BAR"; the second will only include vertices having name "BAZ". In both cases the traversal will go on.
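A filter on v or d, by contrast, only affects what is returned. The following plain-JavaScript model (again with a made-up graph and a hypothetical `traverseExcluding` helper, not ArangoDB code) mirrors `FILTER v.name == "BAZ"`: non-matching vertices are dropped from the output, yet the traversal still walks through them.

```javascript
// Hypothetical vertices and their outgoing edges.
const vertices = {
  root: { name: "ROOT" },
  x: { name: "FOO" },
  y: { name: "BAZ" },
  z: { name: "BAZ" },
};
const edges = { root: ["x", "z"], x: ["y"], y: [], z: [] };

// Visits every reachable vertex up to maxDepth, but only returns the ones
// accepted by keep. Rejected vertices are still expanded (exclude, no prune).
function traverseExcluding(start, maxDepth, keep) {
  const returned = [];
  function visit(key, depth) {
    if (depth === maxDepth) return;
    for (const nextKey of edges[key]) {
      if (keep(vertices[nextKey])) returned.push(nextKey);
      visit(nextKey, depth + 1); // continue regardless of the filter result
    }
  }
  visit(start, 0);
  return returned;
}

const kept = traverseExcluding("root", 10, (v) => v.name === "BAZ");
console.log(kept); // "y" is returned although it is only reachable through "x"
```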

Right now there is no option to say "prune, include" (stop below a vertex but still return it).
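As a rough aid for the migration, the sketch below classifies the legacy filter return values from the question into the cases described above. The helper names `interpretFilterResult` and `aqlCounterpart` are hypothetical, and the mapping simply restates this answer; it is not an ArangoDB API.

```javascript
// Normalizes a legacy filter result ('prune', 'exclude', an array of both,
// or nothing at all) into two flags.
function interpretFilterResult(result) {
  const flags = [].concat(result === undefined || result === null ? [] : result);
  return {
    include: !flags.includes("exclude"), // is the vertex returned?
    expand: !flags.includes("prune"),    // is the traversal continued below it?
  };
}

// Rough AQL 3.0 counterpart for each combination, following the answer.
function aqlCounterpart({ include, expand }) {
  if (include && expand) return "no FILTER needed";
  if (!include && expand) return "FILTER on v or d";
  if (!include && !expand) return "FILTER on p";
  return "no direct equivalent (prune but include)";
}

for (const legacy of [undefined, "exclude", ["exclude", "prune"], "prune"]) {
  console.log(JSON.stringify(legacy), "->", aqlCounterpart(interpretFilterResult(legacy)));
}
```

Applied to the snippet in the question, the `['exclude', 'prune']` branch corresponds to a path filter, the plain `'exclude'` branch to a filter on v, and the plain `'prune'` branch is the case that cannot be expressed with FILTER alone.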

Using a UDF only to implement the filtering is extremely bad for performance. That is because the UDF is a "black box" for AQL and, in particular, cannot be optimized into the traversal for pruning. Still, the AQL traversal performed orders of magnitude better in our internal tests, which is why we decided to go that way.

Unfortunately, UDFs are slightly more flexible than pure AQL, so there may be some functions that cannot be translated to FILTER-only statements. However, there is still an option to execute these traversals in the same way as before 3.0: define the entire traversal as a user-defined function. This should perform about the same as before (the high-level algorithm is identical, but we changed many other internal parts in 3.0 that have performance side effects here).

This is explained in more detail here: https://docs.arangodb.com/3.1/Manual/Graphs/Traversals/UsingTraversalObjects.html

Your new UDF should look roughly like this and take the startVertex as input:

var db = require("internal").db;
var traversal = require("@arangodb/graph/traversal");
var config = {
  datasource: traversal.collectionDatasource("menu"),
  filter: db._aqlfunctions.document("UDF::CUSTOMFILTER").code,
  visitor: db._aqlfunctions.document("UDF::CUSTOMVISITOR").code,
  maxDepth: 1 // has to be defined
};
var result = {
  visited: {
    vertices: [ ],
    paths: [ ]
  }
};
var traverser = new traversal.Traverser(config);
traverser.traverse(result, startVertex);
// [...] Do stuff with result here

If you need help translating the UDF to FILTER statements or getting the full traversal UDF up and running, please contact us directly via https://groups.google.com/forum/#!forum/arangodb. It may take a few mails to sort out all the details you need.

