Does Entity Framework query the database multiple times if I use different fields of the same Linq query at different times?

I searched the Internet and Stack Overflow but couldn't find a helpful resource. Perhaps I am not using the right wording to search. If I have missed any previous questions for this reason, please let me know and I will take this question down.


I am dealing with a busy database, so I need to send as few queries to it as possible.

If I access different columns of the same Linq query from different levels of the code, is Entity Framework smart enough to foresee the required columns and fetch them all at once, or does it call the database twice?

e.g.

var query = from t1 in table_1
            join t2 in table_2 on t1.col1 equals t2.col1
            where t1.EmployeeId == EmployeeId
            group new { t1, t2 } by t1.col2 into grouped
            orderby grouped.Count() descending
            select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };

var records = query.Take(10);

// point x
var x = records.Select(a => a.Column1).ToArray();

var y = records.Select(a => a.Column2).ToArray();

Does EF query the database twice to facilitate x and y (sending one query to get Column1 and then another to get Column2), or is it smart enough to know that both columns need to be materialised and fetch them both at point x?

Added to clarify the intention of the question:

I understand I can simply add a greedy method to the end of query.Take(10) and be done with it, but I am trying to understand whether the approach I tried (which is, in my opinion, more elegant) works and, if not, what makes EF issue two queries.

Yes, your code as written will generate two queries that are executed against the database. The reason is that two different SQL statements are generated:

  1. First is the top query, taking only 10 records and then only Column1
  2. Second is the top query, taking only 10 records and then only Column2

These are two queries because you call ToArray() over two different Select statements, and each one generates different SQL. Most Linq queries use deferred execution: they run only when you call something like ToArray() / ToList() / FirstOrDefault() and so on, i.e. the methods that actually give you concrete data. In your original query you have two separate ToArray() calls on data that has not yet been retrieved, which means two queries (one for the first field and one for the second).

The following code will result in a single query to the database:

var records = (from t1 in table_1
               join t2 in table_2 on t1.col1 equals t2.col1
               where t1.EmployeeId == EmployeeId
               group new { t1, t2 } by t1.col2 into grouped
               orderby grouped.Count() descending
               select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) })
              .Take(10).ToList();

var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();

In my solution above I added a ToList() after narrowing the query down to only the data you need ( Take(10) ), and that is the point at which it executes against the database. After that, all the data is in memory and you can run any other Linq operation over it without going to the database again.
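The difference between the two versions can be reproduced without a database. The following is only a rough sketch using Linq to Objects, not EF itself: the Source() iterator and its counter are hypothetical stand-ins for the grouped query and for "queries sent to the database". The iterator body runs again each time the deferred sequence is enumerated, much as EF sends SQL again each time an unmaterialised query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // How many times the source was enumerated from the start;
    // a stand-in for "queries sent to the database".
    private static int _enumerations;

    // Hypothetical stand-in for the grouped query. The body only runs
    // when something actually iterates the sequence.
    private static IEnumerable<(string Column1, int Column2)> Source()
    {
        _enumerations++;
        yield return ("a", 10);
        yield return ("b", 20);
        yield return ("c", 30);
    }

    // Two ToArray() calls on the still-deferred sequence.
    public static int CountWithoutMaterializing()
    {
        _enumerations = 0;
        var records = Source().Take(2);                    // nothing enumerated yet
        var x = records.Select(a => a.Column1).ToArray();  // first enumeration
        var y = records.Select(a => a.Column2).ToArray();  // second enumeration
        return _enumerations;
    }

    // One ToList(), then projections over the in-memory list.
    public static int CountWithToList()
    {
        _enumerations = 0;
        var records = Source().Take(2).ToList();           // enumerated once, here
        var x = records.Select(a => a.Column1).ToArray();  // in memory
        var y = records.Select(a => a.Column2).ToArray();  // in memory
        return _enumerations;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithoutMaterializing()); // prints 2
        Console.WriteLine(CountWithToList());           // prints 1
    }
}
```

With EF the same pattern applies, except that each enumeration of an unmaterialised IQueryable becomes a round trip to the database.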


Add ToString() calls to your code so you can check the generated SQL at different points (this works in EF6; in EF Core 5.0 and later, ToQueryString() serves the same purpose). Then you will see when and what is being executed:

var query = from t1 in table_1
            join t2 in table_2 on t1.col1 equals t2.col1
            where t1.EmployeeId == EmployeeId
            group new { t1, t2 } by t1.col2 into grouped
            orderby grouped.Count() descending
            select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var generatedSql = query.ToString(); // Here you will see a query that brings all records

var records = query.Take(10);
generatedSql = records.ToString(); // Here you will see it taking only 10 records


// point x
var xQuery = records.Select(a => a.Column1);
generatedSql = xQuery.ToString(); // Here you will see only 1 column in query

// Still nothing has been executed to DB at this point

var x = xQuery.ToArray(); // And that is what will be executed here

// Now you are before second execution

var yQuery = records.Select(a => a.Column2);
generatedSql = yQuery.ToString(); // Here you will see only the second column in query

// Finally, second execution, now with the other column

var y = yQuery.ToArray();

When you write a Linq statement against an entity in EF, it only prepares the SELECT statement (that's why the type is IQueryable). The data is loaded lazily: the query is evaluated, through an enumerator, only when you try to use a value from it.

So when you explicitly turn it into a collection (.ToList() etc.), EF has to fetch the data to populate the list, and that is when the SQL command is fired.

It is designed this way to improve performance: if only a particular property of an entity is used, EF doesn't fetch the values of all the columns from that table.
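To see why two different Select calls cannot share one query, it may help to look at what an IQueryable actually holds. The sketch below is an illustration only: it uses AsQueryable() over an in-memory array as a stand-in for an EF set, and the Row class is hypothetical. Each Select produces a new IQueryable with its own expression tree, and a provider such as EF translates each tree separately when it is enumerated.

```csharp
using System;
using System.Linq;

// Hypothetical row type standing in for the projected anonymous type.
public sealed class Row
{
    public string Column1 { get; set; } = "";
    public int Column2 { get; set; }
}

public static class ExpressionTreeDemo
{
    // Returns the textual form of the expression trees behind the two projections.
    public static (string X, string Y) Describe()
    {
        // In-memory stand-in for an EF query; nothing is enumerated in this method.
        IQueryable<Row> records = new[]
        {
            new Row { Column1 = "a", Column2 = 10 },
            new Row { Column1 = "b", Column2 = 20 },
        }.AsQueryable().Take(10);

        var xQuery = records.Select(a => a.Column1); // tree ends in Select(a => a.Column1)
        var yQuery = records.Select(a => a.Column2); // tree ends in Select(a => a.Column2)

        return (xQuery.Expression.ToString(), yQuery.Expression.ToString());
    }

    public static void Main()
    {
        var (x, y) = Describe();
        Console.WriteLine(x);
        Console.WriteLine(y);
        Console.WriteLine(x == y); // prints False: the two trees differ
    }
}
```

Because the two trees are distinct objects describing different projections, the provider has no way of knowing at the first ToArray() that a second projection will follow.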
