
LINQ query with subqueries vs. query and foreach

I have a very large amount of data that I need to gather for a report I am generating. All of this data comes from a database that I am connected to via Entity Framework. I have tried writing this query a few different ways, but no matter what I do it seems to be slow.

Overall, I am curious whether it is more efficient to have a single LINQ query with subqueries, or to run a base query and then fetch those values in a foreach loop.

Additional information about the DB: many of the subqueries/loop iterations would be querying the largest tables in the database.

Example code:

var b = (from brk in entities.Brokers
         join pcy in Policies on brk.BrkId equals pcy.PcyBrkId
         where pcy.DateStamp > twoYearsAgo
         select new returnData
         {
             BrkId = brk.BrkId,
             currentPrem = (from pcy in Policies
                            where pcy.PcyBrkId == brk.BrkId && pcy.InvDate > startDate && pcy.InvDate < endDate
                            select pcy.Premium).Sum(),
             // 5 more similar subqueries
         }).GroupBy(x => x.BrkId).Select(x => x.FirstOrDefault()).ToList();

OR

var b = (from brk in entities.Brokers
         join pcy in Policies on brk.BrkId equals pcy.PcyBrkId
         where pcy.DateStamp > twoYearsAgo
         select new returnData
         {
             BrkId = brk.BrkId
         }).GroupBy(x => x.BrkId).Select(x => x.FirstOrDefault()).ToList();
foreach (var brk in b)
{
    // grab the data from the subqueries here
}

One additional detail: if I grab the primary information first, I may be able to filter out some records, reducing the number of results to go through in the foreach.

First of all, performance questions always warrant profiling, no matter how reasonable or logical one solution or another might seem.

That said, when working with a database, fewer round trips to the database are usually better. So in your case it will likely be more efficient to retrieve one big chunk of data with a single SQL query over the network, and then process it locally with loops and whatnot. This guideline should give you a near-optimal solution in most cases.

Obviously, it all depends on how big that data is, how much network bandwidth you have, and how fast and well tuned your database is.
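The single-round-trip idea can be sketched as one grouped query with conditional aggregation, so that all the per-broker sums come back in the same result set instead of one subquery per column. This is only a sketch built on the (hypothetical) model names from the question (`Policies`, `PcyBrkId`, `InvDate`, `Premium`, `DateStamp`); exact SQL translation depends on your EF version:

```csharp
// Sketch: one round trip that computes all per-broker aggregates server-side.
// Filtering inside the group is translated by EF into SUM(CASE WHEN ... END),
// so every aggregate is evaluated in the same SQL statement.
var report = (from pcy in entities.Policies
              where pcy.DateStamp > twoYearsAgo
              group pcy by pcy.PcyBrkId into g
              select new
              {
                  BrkId = g.Key,
                  CurrentPrem = g.Where(p => p.InvDate > startDate && p.InvDate < endDate)
                                 .Sum(p => (decimal?)p.Premium) ?? 0,
                  // ...the five other aggregates follow the same
                  // Where(...).Sum(...) pattern on the group
              }).ToList();
```

The cast to `decimal?` with `?? 0` guards against SQL's `SUM` returning `NULL` when a broker has no policies in the date range. Note this also removes the need for the `GroupBy(...).Select(x => x.FirstOrDefault())` de-duplication step, since grouping is now part of the query itself.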

Side note: in general, if you work with big or complex (intertwined) data, it is better to avoid Entity Framework altogether, especially when you are concerned about performance. I am not sure whether that would work for you.

