
Understanding deferred execution performance

Suppose we have a list of students:

var students = new List<Student>(); // fill with 5000 students

We then find the youngest male student from this list:

Method 1:

var youngestMaleStudent = students.Where(s => s.Gender == "male").OrderBy(s => s.Age).First();

Console.WriteLine(youngestMaleStudent.Name);

Method 2:

var maleStudents = students.Where(s => s.Gender == "male").ToList();

var youngestMaleStudent = maleStudents.OrderBy(s => s.Age).First();

Console.WriteLine(youngestMaleStudent.Name);

I would think Method 1 should be more efficient, as Method 2 creates a new list and copies everything into it. Presumably this isn't a huge deal, since copying memory is relatively fast (though 5000 objects may start to weigh things down)?

But then I wonder: do they run differently at all, performance-wise? How does LINQ process each step in Method 1? Does it not need to copy everything into a list of some form before it can start sorting (ordering) the data?

LINQ deferred execution lets you chain the different parts of a query, such as Select, Where and OrderBy, much as in SQL. The query is executed only when its results are used, for example by a foreach loop or a ToList() call.
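To make the deferral visible, here is a minimal sketch that counts how often a Where predicate runs before and after the query is enumerated. The class name DeferredDemo and the sample ages are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the number of predicate calls seen before and after enumeration.
    public static (int beforeEnumeration, int afterEnumeration) CountPredicateCalls()
    {
        var ages = new List<int> { 19, 22, 18, 25 }; // illustrative data
        int calls = 0;

        // Building the query runs nothing: Where only stores the predicate.
        IEnumerable<int> adults = ages.Where(a => { calls++; return a >= 21; });
        int before = calls;

        // ToList() enumerates the source, so the predicate runs once per element.
        _ = adults.ToList();
        return (before, calls);
    }
}
```

Calling DeferredDemo.CountPredicateCalls() yields 0 calls before enumeration and 4 after, one per element in the source list.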

The chaining is made possible by the fluent interface pattern.

What are the benefits of a Deferred Execution in LINQ?

Deferred Execution of LINQ Query (tutorialsteacher.com)

Deferred Vs Immediate Query Execution in LINQ (c-sharpcorner.com)

Therefore Method 1 should be at least as fast: in Method 2, ToList() executes the filter query, and OrderBy(...).First() then executes a second query over the resulting list. At worst this can approach about twice the work, without considering underlying caches and optimizations. Note that OrderBy has to read all of its input before it can return the first element, so Method 1 still enumerates every filtered student; Method 2 additionally copies them into a list first.

In other words, in Method 1 the query is executed only by the First() call. All the previous calls are deferred and merely prepare the final query, much like adding parameters to a string in the case of SQL (and it is about the same for any other target). In Method 2, ToList() executes the query and creates a List<> instance, which costs time and memory, and the First() call then runs another query on this list, which costs time and memory again.
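The two shapes can be put side by side as a small sketch for the in-memory (LINQ to Objects) case. The Student class, the YoungestMale helper and the sample data below are assumptions for illustration; the question only shows that Student has Name, Gender and Age.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Student type; the question only uses Name, Gender and Age.
public class Student
{
    public string Name { get; set; } = "";
    public string Gender { get; set; } = "";
    public int Age { get; set; }
}

public static class YoungestMale
{
    // Method 1: one pipeline; nothing runs until First() pulls from OrderBy.
    public static Student SinglePipeline(IEnumerable<Student> students) =>
        students.Where(s => s.Gender == "male").OrderBy(s => s.Age).First();

    // Method 2: ToList() runs the filter immediately and copies the matches
    // into a new List<Student>; OrderBy then reads that list.
    public static Student WithIntermediateList(IEnumerable<Student> students)
    {
        List<Student> maleStudents = students.Where(s => s.Gender == "male").ToList();
        return maleStudents.OrderBy(s => s.Age).First();
    }

    // Small made-up sample, used only for illustration.
    public static List<Student> Sample() => new List<Student>
    {
        new Student { Name = "Ann", Gender = "female", Age = 17 },
        new Student { Name = "Bob", Gender = "male", Age = 20 },
        new Student { Name = "Cal", Gender = "male", Age = 18 },
    };
}
```

Both methods return the same student for the same input; the only difference is the intermediate list that Method 2 allocates. Whether that difference is measurable for 5000 students is best checked with a benchmark rather than assumed.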

So it is important to check in the documentation of each LINQ method whether it is deferred or not.

LINQ can be both a performance and spaghetti-code killer as well as a black hole.

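If students is an IQueryable (for example, a database-backed query) rather than the in-memory List<Student> shown in the question, the difference is larger. In Method 2, .ToList() executes students.Where(s => s.Gender == "male") and turns the IQueryable result into an in-memory list, so every male student is fetched from the database, and the ordering and First() are then applied in memory.

In Method 1, the chain students.Where(s => s.Gender == "male").OrderBy(s => s.Age) remains an IQueryable until First() is called, so the whole query is processed on the database side and only the first row is returned.

As a result, Method 1 performs better, especially when the data set is large.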
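The boundary between what the query provider sees and what runs in memory can be sketched without a database. The example below uses AsQueryable() over an array as a stand-in for a database-backed source and lists the operators recorded in the expression tree. QueryableBoundary and its methods are hypothetical helper names, and a real provider would translate the recorded operators rather than just list them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableBoundary
{
    // Names of the operators recorded in an IQueryable's expression tree,
    // outermost first. These are what a query provider gets to translate.
    public static List<string> RecordedOperators(IQueryable query)
    {
        var names = new List<string>();
        Expression node = query.Expression;
        while (node is MethodCallExpression call && call.Arguments.Count > 0)
        {
            names.Add(call.Method.Name);
            node = call.Arguments[0]; // the source the operator was applied to
        }
        return names;
    }

    public static (List<string> method1, List<string> method2) Compare()
    {
        // Assumption: an in-memory array stands in for a database table.
        IQueryable<int> ages = new[] { 20, 18, 23 }.AsQueryable();

        // Method 1 shape: filter and ordering are both part of the query.
        var m1 = ages.Where(a => a > 18).OrderBy(a => a);

        // Method 2 shape: only the filter is part of the query; after ToList()
        // the ordering would be done by LINQ to Objects in memory.
        var m2 = ages.Where(a => a > 18);

        return (RecordedOperators(m1), RecordedOperators(m2));
    }
}
```

For the Method 1 shape the provider sees both OrderBy and Where; for the Method 2 shape it sees only Where, because everything after ToList() operates on the materialized list.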
