LINQ Query takes too long

I am running this query:

List<TransRerocdByGeo> reports = (from p in db.SOMETABLE.ToList()
    where (p.colID == prog && p.status != "some_string" && p.col_date < enddate && p.col_date > startdate)
    group p by new {
        country = (p.some_integer_that_represents_an_id <= -0)
            ? "unknown"
            : (from f in db.A_LAGE_TABLE
               where (f.ID == p.some_integer_that_represents_an_id)
               select f.COUNTRIES_TABLE.COU_Name).FirstOrDefault(),
        p.status }
    into g
    select new TransRerocdByGeo
    {
        ColA = g.Sum(x => x.ColA),
        ColB = g.Sum(x => x.ColB),
        Percentage = (g.Sum(x => x.ColA) != null && g.Sum(x => x.ColA) != 0) ? (g.Sum(x => x.ColB) / g.Sum(x => x.ColA)) * 100 : 0,
        Status = g.Key.status,
        Country = g.Key.country
    }).ToList();

A similar query written in SQL against the same database runs in a few seconds, while this one takes about 30-60 seconds in the good case.

The table SOMETABLE contains about 10-60 thousand rows, and the table called A_LARGE_TABLE here contains about 10-20 million rows.

The column some_integer_that_represents_an_id is the ID in the large table, but it can also be 0 or -1, in which case the row needs to get the "unknown" value. So I cannot define a relationship (or can I? If so, please explain).

The COUNTRIES_TABLE contains 100-200 rows.

The colID and ID columns are identity columns.

Any suggestions?

You're calling ToList on SOMETABLE right at the start. This pulls the entire database table, with all rows and all columns, into memory, and every subsequent operation then runs via LINQ to Objects on that in-memory list.

Not only do you pay the penalty of transferring far more data across the network than you need (which is slow), but C# can't perform these operations nearly as efficiently as a database can. Once the data is in memory, the query loses access to indexes, database caching, cached compiled query plans, and the higher-level query optimizations that databases tend to apply. In-memory LINQ is also not as efficient at handling data sets that large to begin with.
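To make the effect of the early ToList call concrete, here is a minimal sketch. It assumes C# 9 or later (top-level statements) and uses an in-memory list wrapped with AsQueryable() as a hypothetical stand-in for db.SOMETABLE, so no SQL is generated here. The sample rows and the value of prog are made up. The point is only which kind of sequence the Where call is applied to.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for db.SOMETABLE (assumption: a real provider would translate queries to SQL).
var someTable = Enumerable.Range(0, 1000)
    .Select(i => new { colID = i % 10, status = i % 2 == 0 ? "ok" : "some_string" })
    .ToList()
    .AsQueryable();

int prog = 4; // made-up filter value

// Where applied to the IQueryable: the filter is part of the query the provider receives.
var composed = someTable.Where(p => p.colID == prog && p.status != "some_string");

// ToList() first: all rows are materialized, then Enumerable.Where filters them in memory.
var inMemory = someTable.ToList().Where(p => p.colID == prog && p.status != "some_string");

bool composedIsQueryable = ((object)composed) is IQueryable;
bool inMemoryIsQueryable = ((object)inMemory) is IQueryable;
int composedCount = composed.Count();
int inMemoryCount = inMemory.Count();

Console.WriteLine($"composed is IQueryable: {composedIsQueryable}");
Console.WriteLine($"inMemory is IQueryable: {inMemoryIsQueryable}");
Console.WriteLine($"same rows either way: {composedCount == inMemoryCount}");
```

Both versions return the same rows; the difference is where the filtering happens. With a real database provider, only the first form gives the provider a chance to turn the filter into a WHERE clause.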

Next, you have a query inside your GroupBy clause (from f in db.A_LAGE_TABLE where [...]) that is performed for every row in the sequence. As written, with SOMETABLE already in memory, each of those lookups is a separate round trip to the database. If the entire query were evaluated at the database level, the lookup could potentially be optimized, and even if it weren't, you would not be going across the network (which is quite slow) for each record.
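One way to express the country lookup so that the whole query stays composable is a left join, sketched below. This is an illustration under assumptions, not a drop-in fix: it assumes C# 9 or later, uses in-memory arrays with AsQueryable() as hypothetical stand-ins for db.SOMETABLE and db.A_LAGE_TABLE, shortens some_integer_that_represents_an_id to refId, flattens f.COUNTRIES_TABLE.COU_Name into a CountryName property, and omits the date filter for brevity. Whether a given provider translates this exact shape to SQL is something to verify against the generated query.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for db.SOMETABLE and db.A_LAGE_TABLE (made-up data).
var someTable = new[]
{
    new { colID = 1, status = "done",        refId = 10, ColA = 4m, ColB = 1m },
    new { colID = 1, status = "done",        refId = 10, ColA = 6m, ColB = 4m },
    new { colID = 1, status = "done",        refId = 0,  ColA = 2m, ColB = 1m },
    new { colID = 1, status = "some_string", refId = 11, ColA = 9m, ColB = 9m },
    new { colID = 2, status = "done",        refId = 11, ColA = 5m, ColB = 5m },
}.AsQueryable();

var largeTable = new[]
{
    new { ID = 10, CountryName = "France" }, // CountryName stands in for f.COUNTRIES_TABLE.COU_Name
    new { ID = 11, CountryName = "Japan" },
}.AsQueryable();

int prog = 1; // made-up filter value

var reports =
    (from p in someTable                                  // no ToList() here, so the query stays composable
     where p.colID == prog && p.status != "some_string"
     join f in largeTable on p.refId equals f.ID into matches
     from f in matches.DefaultIfEmpty()                   // left join: ids such as 0 or -1 match nothing, so f is null
     group p by new { Country = f == null ? "unknown" : f.CountryName, p.status } into g
     select new
     {
         g.Key.Country,
         Status = g.Key.status,
         ColA = g.Sum(x => x.ColA),
         ColB = g.Sum(x => x.ColB),
         Percentage = g.Sum(x => x.ColA) != 0
             ? g.Sum(x => x.ColB) / g.Sum(x => x.ColA) * 100
             : 0m,
     })
    .ToList();                                            // materialize only the grouped result

foreach (var r in reports)
{
    Console.WriteLine($"{r.Country} {r.Status} ColA={r.ColA} ColB={r.ColB} Percentage={r.Percentage}");
}
```

Because rows whose id is 0 or -1 simply find no match in the join, they fall into the "unknown" group without any per-row subquery, and ToList is called only once, on the already grouped result.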

from p in db.SOMETABLE.ToList()

This basically says "get every record from SOMETABLE and put it in a List", without any filtering at all. This is probably your problem.
