How can I speed up this recursive function?

I have a recursive function that builds a node list from an IEnumerable of about 2000 records. The procedure currently takes around 9 seconds to complete and has become a major performance issue. The function serves to:

a) sort the nodes hierarchically

b) calculate the depth of each node

This is a stripped down example:

public class Node
{
    public string Id { get; set; }
    public string ParentId { get; set; }
    public int Depth { get; set; }
}

private void GetSortedList()
{
    // next line pulls the nodes from the DB, not included here to simplify the example
    IEnumerable<Node> ie = GetNodes();

    var l = new List<Node>();
    foreach (Node n in ie)
    {
        if (string.IsNullOrWhiteSpace(n.ParentId))
        {
            n.Depth = 1;
            l.Add(n);
            AddChildNodes(n, l, ie);
        }
    }
}

private void AddChildNodes(Node parent, List<Node> newNodeList, IEnumerable<Node> ie)
{
    foreach (Node n in ie)
    {
        if (!string.IsNullOrWhiteSpace(n.ParentId) && n.ParentId == parent.Id)
        {
            n.Depth = parent.Depth + 1;
            newNodeList.Add(n);
            AddChildNodes(n, newNodeList, ie);
        }
    }
}

What would be the best way to rewrite this to maximize performance? I've experimented with the yield keyword but I'm not sure that will get me the result I am looking for. I've also read about using a stack but none of the examples I have found use parent IDs (they use child node lists instead), so I am a little confused on how to approach it.

Recursion is not what is causing your performance problem. The real problem is that on each recursive call to AddChildNodes, you traverse the entire list to find the children of the current parent, so your algorithm ends up being O(n^2). For 2000 records, that is roughly 4 million comparisons.

To get around this, you can create a dictionary that, for each node Id, gives a list of all its children. This can be done in a single pass over the list. Then you can start with the root Id ("") and recursively visit each of its children (i.e. a depth-first traversal). This visits each node exactly once, so the entire algorithm is O(n). Code is shown below.

After calling GetSortedList, the sorted result is in result. Note that you could make children and result local variables in GetSortedList and pass them as parameters to DepthFirstTraversal, if you prefer. But that adds a little overhead to the recursive calls for no benefit, since those two parameters would have the same values on every call.

You can get rid of the recursion using stacks, but the performance gain would probably not be worth it.

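// Requires: using System; using System.Collections.Generic; using System.Linq;
Dictionary<string, List<Node>> children = null;
List<Node> result = null;

private void GetSortedList()
{
    // Materialize once so the source (e.g. a DB query) is not enumerated repeatedly.
    List<Node> nodes = GetNodes().ToList();
    children = new Dictionary<string, List<Node>>();

    // Construct the dictionary: parent Id -> list of child nodes.
    foreach (var n in nodes)
    {
        // Root nodes may have a null or blank ParentId. A null key would throw,
        // so all root nodes are stored under the key "".
        string key = string.IsNullOrWhiteSpace(n.ParentId) ? "" : n.ParentId;
        if (!children.TryGetValue(key, out var siblings))
        {
            siblings = new List<Node>();
            children[key] = siblings;
        }
        siblings.Add(n);
    }

    // Depth-first traversal, starting from the root key "".
    result = new List<Node>();
    DepthFirstTraversal("", 1);

    if (result.Count != nodes.Count)
    {
        // Nodes that are part of a cycle, or whose parent is missing, cannot be
        // reached from the root and therefore will not be contained in the result.
        throw new InvalidOperationException("Original list of nodes contains cycles or orphaned nodes");
    }
}

private void DepthFirstTraversal(string parentId, int depth)
{
    if (children.TryGetValue(parentId, out var childNodes))
    {
        foreach (var child in childNodes)
        {
            child.Depth = depth;
            result.Add(child);
            DepthFirstTraversal(child.Id, depth + 1);
        }
    }
}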
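
For completeness, here is a rough sketch of the stack-based variant mentioned above, working from parent IDs rather than child node lists. It is not part of the approach shown so far, just one way it might look. NodeSorter and SortIteratively are hypothetical names, and the sketch assumes every Id is unique and non-blank. It uses ToLookup to group the nodes by parent in one pass, and keeps everything in local variables instead of fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Node
{
    public string Id { get; set; }
    public string ParentId { get; set; }
    public int Depth { get; set; }
}

public static class NodeSorter
{
    // Hypothetical helper: returns the nodes in depth-first order and sets
    // Depth on each one, using an explicit stack instead of recursion.
    // Assumes every Id is unique and non-blank.
    public static List<Node> SortIteratively(IEnumerable<Node> source)
    {
        var nodes = source.ToList(); // enumerate the source only once

        // Group nodes by parent in a single pass; root nodes share the key "".
        ILookup<string, Node> children = nodes.ToLookup(
            n => string.IsNullOrWhiteSpace(n.ParentId) ? "" : n.ParentId);

        var result = new List<Node>(nodes.Count);
        var stack = new Stack<Node>();

        // Push in reverse so that the first root is popped first.
        foreach (var root in children[""].Reverse())
        {
            root.Depth = 1;
            stack.Push(root);
        }

        while (stack.Count > 0)
        {
            var current = stack.Pop();
            result.Add(current);

            // A lookup returns an empty sequence for ids that have no children.
            foreach (var child in children[current.Id].Reverse())
            {
                child.Depth = current.Depth + 1;
                stack.Push(child);
            }
        }

        if (result.Count != nodes.Count)
        {
            throw new InvalidOperationException(
                "Some nodes are unreachable from a root (cycle or missing parent).");
        }
        return result;
    }
}
```

Pushing the children in reverse order keeps siblings in their original order, so the output should match what the recursive version produces. As noted above, the speed difference from dropping recursion is likely small. The main practical benefit is that a very deep hierarchy cannot overflow the call stack.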
