
IEnumerable from recursive method is 10x slower than same IEnumerable constructed with foreach

I do not understand why one IEnumerable.Contains() call is faster than the other in the following snippet, even though both sequences contain the same elements.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Group
{
    public static Dictionary<int, Group> groups = new Dictionary<int, Group>();

    // Members, user and groups
    public List<string> Users = new List<string>();
    public List<int> GroupIds = new List<int>();

    public IEnumerable<string> AggregateUsers()
    {
        IEnumerable<string> aggregatedUsers = Users.AsEnumerable();
        foreach (int id in GroupIds)
            aggregatedUsers = aggregatedUsers.Concat(groups[id].AggregateUsers());
        return aggregatedUsers;
    }
}

static void Main(string[] args)
{
    for (int i = 0; i < 1000; i++)
        Group.groups.TryAdd(i, new Group());

    for (int i = 0; i < 999; i++)
        Group.groups[i + 1].GroupIds.Add(i);

    for (int i = 0; i < 10000; i++)
        Group.groups[i/10].Users.Add($"user{i}");

    IEnumerable<string> users = Group.groups[999].AggregateUsers();

    Stopwatch stopwatch = Stopwatch.StartNew();
    bool contains1 = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from recursive function was {contains1} and took {stopwatch.ElapsedMilliseconds} ms");

    users = Enumerable.Empty<string>();
    foreach (Group group in Group.groups.Values.Reverse())
        users = users.Concat(group.Users);

    stopwatch = Stopwatch.StartNew();
    bool contains2 = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from foreach was {contains2} and took {stopwatch.ElapsedMilliseconds} ms");

    Console.Read();
}

Here is the output obtained by executing this snippet:

Search through IEnumerable from recursive function was True and took 40 ms
Search through IEnumerable from foreach was True and took 3 ms

The snippet simulates 10,000 users distributed in 1,000 groups of 10 users each.

Each group can have 2 types of members, users (a string), or other groups (an int representing the ID of that group).

Each group has the previous group as a member: group 0 has 10 users, group 1 has its own 10 users plus the users of group 0, group 2 has its own 10 users plus those of group 1, and so on. This is where the recursion comes from.

The purpose of the search is to determine whether user "user0" (which is near the end of the enumeration) is a member of group 999 (which, through the group relations, contains all 10,000 users).

The question is: why does the search take only 3 ms on the IEnumerable built with foreach, but roughly 10 times longer on the same sequence built with the recursive method?

An interesting question. When I compiled it for .NET Framework, the execution times were about the same (I had to replace the Dictionary TryAdd call with Add).

On .NET Core I got the same result you observed.

I believe the answer lies in deferred execution and in how each query is built. In the debugger you can see that the first assignment

IEnumerable<string> users = Group.groups[999].AggregateUsers();

produces a Concat2Iterator instance, while the second one

users = Enumerable.Empty<string>();
foreach (Group group in Group.groups.Values.Reverse())
    users = users.Concat(group.Users);

produces a ConcatNIterator.

From the documentation of Concat:

This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
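A minimal sketch of what the quoted passage means in practice. The Source helper and the Log list are hypothetical names introduced only for this illustration; the point is that calling Concat enumerates nothing, and the sources only run once the result is iterated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredConcatDemo
{
    // Records every time a source actually starts being enumerated.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper: an iterator that logs when its enumeration begins.
    public static IEnumerable<int> Source(string name, int value)
    {
        Log.Add($"enumerating {name}");
        yield return value;
    }

    // Returns the log size right after building the query and after enumerating it.
    public static (int before, int after) Run()
    {
        Log.Clear();
        IEnumerable<int> query = Source("a", 1).Concat(Source("b", 2));
        int before = Log.Count;          // nothing has been enumerated yet

        foreach (int item in query) { }  // enumeration happens here
        int after = Log.Count;           // both sources have now run

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"{before} {after}"); // prints "0 2"
    }
}
```

This is also why building either query in the question takes about 0 ms: all of the cost shows up later, inside Contains.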

You can check out the source code of Concat. The implementations of GetEnumerable for ConcatNIterator and Concat2Iterator are different.

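So my guess is that the first query takes longer to enumerate because of the way it is built: each recursive call wraps the previous result in another Concat, so an element from the innermost group has to pass through up to 999 nested iterators before it reaches Contains. If you try using ToList() on one of the enumerables like this: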

IEnumerable<string> users = Group.groups[999].AggregateUsers().ToList();

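you will see that the measured search time drops to almost 0 ms. The nested query is still enumerated once, but that now happens inside ToList(), before the stopwatch starts, and Contains then searches a plain List<string>.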
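To make the cost of nesting visible, here is a rough sketch that counts how often an element is forwarded from one iterator layer to the next. CountingConcat and CountingFlatten are hypothetical stand-ins written with yield return; they are not the library's Concat2Iterator or ConcatNIterator and only model the per-layer forwarding, under the assumption that each wrapping layer costs roughly one extra MoveNext per element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatDepthDemo
{
    // Number of times an element was forwarded by one iterator layer.
    private static long hops;

    // Hypothetical two-source concatenation: every element of either
    // source is forwarded once by this layer on its way to the consumer.
    private static IEnumerable<string> CountingConcat(IEnumerable<string> first, IEnumerable<string> second)
    {
        foreach (string item in first) { hops++; yield return item; }
        foreach (string item in second) { hops++; yield return item; }
    }

    // Hypothetical single-layer concatenation over any number of sources.
    private static IEnumerable<string> CountingFlatten(IEnumerable<IEnumerable<string>> sources)
    {
        foreach (IEnumerable<string> source in sources)
            foreach (string item in source) { hops++; yield return item; }
    }

    private static List<string>[] MakeGroups(int groupCount, int usersPerGroup)
    {
        var groups = new List<string>[groupCount];
        for (int g = 0; g < groupCount; g++)
            groups[g] = Enumerable.Range(g * usersPerGroup, usersPerGroup)
                                  .Select(i => $"user{i}")
                                  .ToList();
        return groups;
    }

    // Nested shape, as produced by the recursive AggregateUsers():
    // Concat(L[n-1], Concat(L[n-2], ... Concat(L[1], L[0])))
    public static long HopsNested(int groupCount, int usersPerGroup)
    {
        List<string>[] groups = MakeGroups(groupCount, usersPerGroup);
        IEnumerable<string> sequence = groups[0];
        for (int g = 1; g < groupCount; g++)
            sequence = CountingConcat(groups[g], sequence);

        hops = 0;
        foreach (string item in sequence) { }   // enumerate everything
        return hops;
    }

    // Flat shape: one layer walks the lists one after another.
    public static long HopsFlat(int groupCount, int usersPerGroup)
    {
        List<string>[] groups = MakeGroups(groupCount, usersPerGroup);
        IEnumerable<string> sequence = CountingFlatten(groups);

        hops = 0;
        foreach (string item in sequence) { }   // enumerate everything
        return hops;
    }

    public static void Main()
    {
        Console.WriteLine(HopsNested(1000, 10)); // 5004990
        Console.WriteLine(HopsFlat(1000, 10));   // 10000
    }
}
```

With the question's sizes, the nested shape forwards elements about 500 times more often than a single flat layer, and "user0" sits in the innermost list, so Contains has to pay for nearly all of it. The measured gap is smaller than that ratio, since a forwarding step is cheap and the real iterators do other work as well.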

I figured out how to overcome the problem after reading Mikołaj's answer and Servy's comment. Thanks!

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Group
{
    public static Dictionary<int, Group> groups = new Dictionary<int, Group>();

    // Members, user and groups
    public List<string> Users = new List<string>();
    public List<int> GroupIds = new List<int>();

    public IEnumerable<string> AggregateUsers()
    {
        IEnumerable<string> aggregatedUsers = Users.AsEnumerable();
        foreach (int id in GroupIds)
            aggregatedUsers = aggregatedUsers.Concat(groups[id].AggregateUsers());
        return aggregatedUsers;
    }

    public IEnumerable<string> AggregateUsers(List<IEnumerable<string>> aggregatedUsers = null)
    {
        bool topStack = false;
        if (aggregatedUsers == null)
        {
            topStack = true;
            aggregatedUsers = new List<IEnumerable<string>>();
        }
        aggregatedUsers.Add(Users.AsEnumerable());
        foreach (int id in GroupIds)
            groups[id].AggregateUsers(aggregatedUsers);

        if (topStack)
            return aggregatedUsers.SelectMany(i => i);
        else
            return null;
    }
}

static void Main(string[] args)
{
    for (int i = 0; i < 1000; i++)
        Group.groups.TryAdd(i, new Group());

    for (int i = 0; i < 999; i++)
        Group.groups[i + 1].GroupIds.Add(i);

    for (int i = 0; i < 10000; i++)
        Group.groups[i / 10].Users.Add($"user{i}");

    Stopwatch stopwatch = Stopwatch.StartNew();
    IEnumerable<string> users = Group.groups[999].AggregateUsers();
    Console.WriteLine($"Aggregation via nested concatenation took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    bool contains = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from nested concatenation was {contains} and took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    users = Group.groups[999].AggregateUsers(null);
    Console.WriteLine($"Aggregation via SelectMany took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    contains = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from SelectMany was {contains} and took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    users = Enumerable.Empty<string>();
    foreach (Group group in Group.groups.Values.Reverse())
        users = users.Concat(group.Users);
    Console.WriteLine($"Aggregation via flat concatenation took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    contains = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from flat concatenation was {contains} and took {stopwatch.ElapsedMilliseconds} ms");

    Console.Read();
}

Here are the results:

Aggregation via nested concatenation took 0 ms
Search through IEnumerable from nested concatenation was True and took 43 ms
Aggregation via SelectMany took 1 ms
Search through IEnumerable from SelectMany was True and took 0 ms
Aggregation via flat concatenation took 0 ms
Search through IEnumerable from flat concatenation was True and took 2 ms
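The same idea can be written without the optional parameter and the null return value. The following is only a sketch of one possible variant: AggregateUsersFlat and CollectUserLists are hypothetical names, and, like the code above, it assumes the group graph has no cycles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Group
{
    public static Dictionary<int, Group> groups = new Dictionary<int, Group>();

    // Members: users and nested groups
    public List<string> Users = new List<string>();
    public List<int> GroupIds = new List<int>();

    // Public entry point: gathers the member lists eagerly,
    // then flattens them lazily with a single SelectMany layer.
    public IEnumerable<string> AggregateUsersFlat()
    {
        var userLists = new List<List<string>>();
        CollectUserLists(userLists);
        return userLists.SelectMany(list => list);
    }

    // Hypothetical helper: depth-first walk that records each group's own list.
    // Assumes no group is (transitively) a member of itself.
    private void CollectUserLists(List<List<string>> userLists)
    {
        userLists.Add(Users);
        foreach (int id in GroupIds)
            groups[id].CollectUserLists(userLists);
    }
}

public static class Program
{
    public static void Main()
    {
        // Same setup as above: 1,000 chained groups with 10 users each.
        for (int i = 0; i < 1000; i++)
            Group.groups[i] = new Group();
        for (int i = 0; i < 999; i++)
            Group.groups[i + 1].GroupIds.Add(i);
        for (int i = 0; i < 10000; i++)
            Group.groups[i / 10].Users.Add($"user{i}");

        IEnumerable<string> users = Group.groups[999].AggregateUsersFlat();
        Console.WriteLine(users.Contains("user0")); // True
        Console.WriteLine(users.Count());           // 10000
    }
}
```

The walk over the groups still happens up front, as in the SelectMany version above, but the caller always gets a usable sequence back and the recursion detail stays private.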
