How can I iterate over the elements of a sub-list and then remove that sub-list from the list, with good performance?

This is an example: originalList is a list of objects

var subList = (originalList.Where(x => x.number < 0)).ToList();
originalList.RemoveAll(x => x.number < 0);

I will use subList later. In this example originalList is iterated twice. This function is called billions of times, and originalList is a large list.

Is there an easy way to improve the performance?


One important detail: the number value of an object can change between two calls of this function.

This method removes all elements fulfilling a condition and returns a list of the removed elements. It only iterates once.

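public List<T> RemoveAll<T>(List<T> input, Func<T,bool> condition)
{
   // Collects the removed elements (in reverse of their original order).
   List<T> result = new List<T>();
   // Walk backwards so that removing an element does not shift
   // the indices of the elements still to be visited.
   for(int i = input.Count - 1; i >= 0; i--)
   {
      if(condition.Invoke(input[i]))
      {
         result.Add(input[i]);
         input.RemoveAt(i);
      }
   }
   return result;
}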

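A for loop is used because the list is modified during iteration, which a foreach does not allow (it throws an InvalidOperationException). The list is iterated backwards because removing an element shifts every later element down by one, so a forward loop would skip elements unless the index were adjusted.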

Online demo: https://dotnetfiddle.net/maox4A
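
As a small illustration of the behaviour described above, here is a sketch that wraps the same loop in a static class and runs it on a list of integers. The class name RemoveAllDemo, the Run method and the sample values are made up for this example, and the method is made static only so the snippet is self-contained. Note that because the list is walked backwards, the removed elements come back in reverse order; call Reverse() on the result if the original order matters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper class, used only to make the example self-contained.
public static class RemoveAllDemo
{
    // Same backward loop as in the answer above, made static for the demo.
    public static List<T> RemoveAll<T>(List<T> input, Func<T, bool> condition)
    {
        List<T> result = new List<T>();
        for (int i = input.Count - 1; i >= 0; i--)
        {
            if (condition(input[i]))
            {
                result.Add(input[i]);
                input.RemoveAt(i);
            }
        }
        return result;
    }

    public static void Run()
    {
        // Sample data, chosen arbitrarily for illustration.
        var numbers = new List<int> { 3, -1, 4, -5, -9, 2 };
        List<int> removed = RemoveAll(numbers, x => x < 0);

        Console.WriteLine(string.Join(", ", numbers)); // 3, 4, 2
        Console.WriteLine(string.Join(", ", removed)); // -9, -5, -1

        // Restore the original relative order if it is needed.
        removed.Reverse();
        Console.WriteLine(string.Join(", ", removed)); // -1, -5, -9
    }
}
```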

An efficiency improvement (though the worst case is unchanged, since each removal still shifts the elements after it) is to batch the removals together. My testing shows that, depending on how often elements are removed, this ranges from the same speed to over 4 times faster. Here is the function as an extension method:

public static List<T> RemoveAllAndReturn<T>(this List<T> input, Func<T, bool> condition) {
    List<T> result = new List<T>();
    var removeCt = 0;
    for (int i = input.Count - 1; i >= 0; --i) {
        if (condition(input[i])) {
            result.Add(input[i]);
            ++removeCt;
        }
        else if (removeCt > 0) {
            input.RemoveRange(i + 1, removeCt);
            removeCt = 0;
        }
    }
    if (removeCt > 0)
        input.RemoveRange(0, removeCt);
    return result;
}
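
Each RemoveRange call above still has to shift the elements that follow the removed block. A possible alternative, which is not taken from the answers here, is to compact the kept elements towards the front in a single forward pass and trim the tail once at the end. The sketch below assumes this is acceptable for your use case; the names ListPartitionSketch and ExtractAll are hypothetical. It calls the condition exactly once per element and keeps both lists in their original order, but if the condition throws partway through, the input list is left partially compacted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; not part of the original answers.
public static class ListPartitionSketch
{
    public static List<T> ExtractAll<T>(this List<T> input, Func<T, bool> condition)
    {
        var removed = new List<T>();
        int write = 0; // next slot for an element that is kept

        for (int read = 0; read < input.Count; read++)
        {
            T item = input[read];
            if (condition(item))
                removed.Add(item);      // matched: move to the result list
            else
                input[write++] = item;  // kept: copy down over the gap
        }

        // Everything from 'write' onwards is stale; drop it in one call.
        input.RemoveRange(write, input.Count - write);
        return removed;
    }
}
```

Usage would look like: var subList = originalList.ExtractAll(x => x.number < 0);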

You could consider doing this hack:

var subList = new List<SomeType>();
originalList.RemoveAll(x =>
{
    bool shouldBeRemoved = x.number < 0;
    if (shouldBeRemoved) subList.Add(x);
    return shouldBeRemoved;
});

The Predicate<T> passed to RemoveAll is not pure: it has the side effect of adding matched elements to subList. Based on the implementation of the RemoveAll method, this hack should work as expected. The documentation, though, does not explicitly guarantee that the predicate will be invoked only once per element:

The elements of the current List<T> are individually passed to the Predicate<T> delegate, and the elements that match the conditions are removed from the List<T>.

So use your own judgment about whether this hack is safe to use.


Edit: You could also make it an extension method:

public static int RemoveAll<T>(this List<T> source, Predicate<T> match,
    out List<T> removed)
{
    var removedLocal = new List<T>();
    removed = removedLocal;
    return source.RemoveAll(x =>
    {
        bool shouldBeRemoved = match(x);
        if (shouldBeRemoved) removedLocal.Add(x);
        return shouldBeRemoved;
    });
}

Usage example:

originalList.RemoveAll(x => x.number < 0, out var subList);
