简体   繁体   中英

Get first (or any) value from HashSet

Currently, my code looks like this:

    private List<Node> dirtyNodes = new List<Node> dirtyNodes();

    public void UpdateDirtyNodes()
    {
        while(dirtyNodes.Count > 0)
        {
            Node nodeToUpdate = dirtyNodes[0];
            nodeToUpdate.UpdateNode();
            dirtyNodes.Remove(nodeToUpdate);
        }
        dirtyNodes.Clear();
    }

    public void MarkNodeDirty(Node node)
    {
        if(!dirtyNodes.Contains(node))
        {
            dirtyNodes.Add(node);
        }
    }

    public void MarkNodeClean(Node node)
    {
        dirtyNodes.Remove(node);
    }

This is a performance-critical part of the code, and it's slower than I'd like because dirtyNodes.Contains has to iterate over the entire array in most cases. I'd like to replace the List with a HashSet because it should be faster, but I can't figure out how to make that work with UpdateDirtyNodes() .

The difficulty is that UpdateNode() can add or remove nodes from dirtyNodes at any time, hence the slightly awkward while loop. Is there a way I can get the "first" value from a HashSet? Order doesn't matter, I just need to stay in the while loop until dirtyNodes is empty, updating whatever node comes next.

I would prefer to avoid using Linq since this code will be part of a library and I don't want to force them to include Linq.

How can I do this?

Add a bool dirty field inside the Node class. This is in addition to keeping a hash set. Then MarkNodeClean() doesn't need to remove nodes from the HashSet, shaving off some CPU cycles.

If you feel that adding a field inside Node class is too "dirty" (pun intended), then just make a HashSet<(Node, bool)> instead of HashSet<Node> , but you are loosing performance on allocating and garbage-collecting extra objects, which is not ideal since your code is performance-critical.

UpdateDirtyNodes() will take nodes one at a time until the HashSet is empty. After taking each node, it will look at the boolean flag to decide whether the node is actually dirty.

PS You should remove dirtyNodes.Clear(); from UpdateDirtyNodes() . This is a race condition. If a node is added after the while loop finds that dirtyNodes.Count is 0, then dirtyNodes.Clear(); clears out that node without processing it. This is a separate bug, not related to your question.

Turns out it's very easy to just use the enumerator directly:

    public void UpdateDirtyNodes()
    {
        while(dirtyNodes.Count > 0)
        {
            using(HashSet<Node>.Enumerator enumerator = dirtyNodes.GetEnumerator())
            {
                if(enumerator.MoveNext())
                {
                    Node nodeToUpdate = enumerator.Current;
                    nodeToUpdate.UpdateNode();
                    dirtyNodes.Remove(nodeToUpdate);
                }
            }
        }
        dirtyNodes.Clear();
    }

    public void MarkNodeDirty(Node node)
    {
        dirtyNodes.Add(node);
    }

I originally tried something similar but didn't fully understand how to manually use enumerators and it didn't work.

It's significantly faster than the List (~25-50% faster depending on the situation) so I'm very happy.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM