Timeout on Neo4j traversal framework

I have a very large graph with hundreds of millions of nodes and relationships, and I need to traverse it to find out whether a specific node is connected to another node that has a particular property value.
The data is highly interconnected, and a pair of nodes can be linked by multiple relationships.

This operation runs in a real-time system, so I have very strict time constraints: possible results must be found within 200 ms.

So I created the following TraversalDescription:

TraversalDescription td = graph.traversalDescription()
                           .depthFirst()
                           .uniqueness(Uniqueness.NODE_GLOBAL)
                           .expand(new SpecificRelsPathExpander(requiredEdgeProperty))
                           .evaluator(new IncludePathWithTargetPropertyEvaluator(targetNodeProperty));

For each path, the Evaluator checks whether the end node is my target: if it is, the path is included and pruned; if not, it is excluded and the traversal continues. I also set a limit on the time spent traversing and on the maximum number of results to find. Everything can be seen in the code below:

private class IncludePathWithTargetPropertyEvaluator implements Evaluator {

    private String targetProperty;
    private int results;
    private long startTime, curTime, elapsed;

    public IncludePathWithTargetPropertyEvaluator(String targetProperty) {
        this.targetProperty = targetProperty;
        this.startTime = System.currentTimeMillis();
        this.results = 0;
    }

    public Evaluation evaluate(Path path) {

        curTime = System.currentTimeMillis();
        elapsed = curTime - startTime;

        // Time budget (200 ms) spent: stop following this branch.
        if (elapsed >= 200) {
            return Evaluation.EXCLUDE_AND_PRUNE;
        }

        // Enough results found: stop following this branch.
        if (results >= 3) {
            return Evaluation.EXCLUDE_AND_PRUNE;
        }

        // A default of null avoids an exception when the property is missing.
        String property = (String) path.endNode().getProperty("propertyName", null);

        if (targetProperty.equals(property)) {
            results = results + 1;
            return Evaluation.INCLUDE_AND_PRUNE;
        }

        return Evaluation.EXCLUDE_AND_CONTINUE;
    }
}

Finally, I wrote a custom PathExpander, because each traversal must follow only edges with a specific property value:

private class SpecificRelsPathExpander implements PathExpander {

    private String requiredProperty;

    public SpecificRelsPathExpander(String requiredProperty) {
        this.requiredProperty = requiredProperty;
    }

    public Iterable<Relationship> expand(Path path, BranchState<Object> state) {
        Iterable<Relationship> rels = path.endNode().getRelationships(RelTypes.FOO, Direction.BOTH);
        if (!rels.iterator().hasNext())
            return null;
        List<Relationship> validRels = new LinkedList<Relationship>();
        for (Relationship rel : rels) {
            String property = (String) rel.getProperty("propertyName");
            if (property.equals(requiredProperty)) {
                validRels.add(rel);
            }
        }
        return validRels;
    }

    // not used
    public PathExpander<Object> reverse() {
        return null;
    }
}

The issue is that the traverser keeps going long after the 200 ms have passed.

From what I understand, for each path evaluated with EXCLUDE_AND_CONTINUE the traverser enqueues all following branches, and it won't stop until it has visited all subsequent paths in the queue.
So if I have even a few nodes with a very high degree, this results in thousands of paths to traverse.

In that case, is there a way to make the traverser stop abruptly when the timeout is reached and return any valid paths found in the meantime?

I would go with the following line of thought:

Once the timeout has elapsed, stop expanding the graph.

private class SpecificRelsPathExpander implements PathExpander {

private String requiredProperty;
private long startTime, curTime, elapsed;

public SpecificRelsPathExpander(String requiredProperty) {
    this.requiredProperty = requiredProperty;
    this.startTime = System.currentTimeMillis();
}

public Iterable<Relationship> expand(Path path, BranchState<Object> state) {
    curTime = System.currentTimeMillis();
    elapsed = curTime - startTime;

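    if (elapsed >= 200) {
        // Time budget spent: return an empty iterable so nothing more is expanded.
        return Collections.emptyList();
    }

    Iterable<Relationship> rels = path.endNode().getRelationships(RelTypes.FOO, Direction.BOTH);
    if (!rels.iterator().hasNext())
        return Collections.emptyList();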
    List<Relationship> validRels = new LinkedList<Relationship>();
    for (Relationship rel : rels) {
        String property = (String) rel.getProperty("propertyName");
        if (property.equals(requiredProperty)) {
            validRels.add(rel);
        }
    }
    return validRels;
}

// not used
public PathExpander<Object> reverse() {
    return null;
}
}
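One detail worth noting in the expander above: the clock starts in the constructor, so the 200 ms budget begins when the expander is built rather than when the traversal actually starts, and System.currentTimeMillis() can jump if the wall clock is adjusted. A minimal sketch of a small helper that avoids both, assuming a hypothetical class named Deadline (not part of Neo4j) that is created right before the traversal is started and handed to the expander and the evaluator:

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not a Neo4j class): a time budget based on the monotonic clock. */
final class Deadline {
    private final long deadlineNanos;

    private Deadline(long deadlineNanos) {
        this.deadlineNanos = deadlineNanos;
    }

    /** Starts the clock at the moment of the call, e.g. right before traversing. */
    static Deadline startingNow(long budgetMillis) {
        return new Deadline(System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis));
    }

    /** True once the budget has been used up; the subtraction form is safe against overflow. */
    boolean expired() {
        return System.nanoTime() - deadlineNanos >= 0;
    }
}
```

With something like this, expand(...) and evaluate(...) could both call deadline.expired() on the same instance instead of each keeping its own startTime, so the two checks cannot drift apart.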

Taking a look at the Neo4j TraversalDescription definition might be helpful too.

I would implement the expander so that it keeps the lazy nature of the traversal framework, which also makes for simpler code. This prevents the traversal from eagerly collecting all relationships of a node, like this:

public class SpecificRelsPathExpander implements PathExpander, Predicate<Relationship>
{
    private final String requiredProperty;

    public SpecificRelsPathExpander( String requiredProperty )
    {
        this.requiredProperty = requiredProperty;
    }

    @Override
    public Iterable<Relationship> expand( Path path, BranchState state )
    {
        Iterable<Relationship> rels = path.endNode().getRelationships( RelTypes.FOO, Direction.BOTH );
        return Iterables.filter( this, rels );
    }

    @Override
    public boolean accept( Relationship relationship )
    {
        return requiredProperty.equals( relationship.getProperty( "propertyName", null ) );
    }

    // not used
    @Override
    public PathExpander<Object> reverse()
    {
        return null;
    }
}
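To illustrate what lazy filtering buys here, the following is a self-contained sketch in plain Java rather than the Neo4j helper used above. The class name LazyFilter is hypothetical, and it uses java.util.function.Predicate instead of the Predicate interface from the answer. The point is only that an element is inspected when the consumer asks for the next match, not up front:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical stand-in for a lazy filter; elements are tested only on demand. */
final class LazyFilter {
    private LazyFilter() {}

    static <T> Iterable<T> filter(Iterable<T> source, Predicate<? super T> keep) {
        return () -> new Iterator<T>() {
            private final Iterator<T> it = source.iterator();
            private T next;
            private boolean ready;

            @Override
            public boolean hasNext() {
                // Advance only until the next matching element is found.
                while (!ready && it.hasNext()) {
                    T candidate = it.next();
                    if (keep.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                T result = next;
                next = null;
                return result;
            }
        };
    }
}
```

For a node with a very high degree, this means the property of a relationship is read only when the traverser actually reaches it, whereas the LinkedList version reads every relationship of the node before returning.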

Also, the traversal continues only as long as the client, i.e. the one holding the Iterator received from starting the traversal, calls hasNext/next. There is no traversal on its own; it all happens in hasNext/next.
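Since all work happens inside hasNext/next, the caller can enforce the time and result limits simply by no longer asking for more paths. A minimal sketch of that idea in plain Java, assuming a hypothetical helper named BoundedCollector that works on any Iterator (for the traversal it would be the iterator obtained from td.traverse(startNode), where startNode is a placeholder name):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: drains a lazy iterator within a time and result budget. */
final class BoundedCollector {
    private BoundedCollector() {}

    /**
     * Pulls items until the budget is spent, maxResults items were collected,
     * or the iterator is exhausted, whichever happens first.
     */
    static <T> List<T> collect(Iterator<T> source, long budgetMillis, int maxResults) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
        List<T> found = new ArrayList<>();
        // The cheap checks come first so hasNext() is not called once a limit is hit.
        while (found.size() < maxResults
                && System.nanoTime() - deadline < 0
                && source.hasNext()) {
            found.add(source.next());
        }
        return found;
    }
}
```

One caveat: the deadline is only checked between calls, and a single hasNext() call keeps traversing until it finds the next path the evaluator includes. When matches are rare, that one call can itself exceed the budget, so a check inside the expander or evaluator is probably still needed alongside this.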
