
How to improve LINQ performance with the repository pattern

I am using the generic repository pattern in a Web API with PostgreSQL as the database. The transaction table holds 300,000 to 1,000,000 rows. For reporting purposes, I need to count transaction rows joined with two other tables. The LINQ query takes around 1.5 minutes to return the data. How can I optimize it or improve its performance?

var data = (from emp in (await new Repository<emp>().GetAll()).ToList()
    join trans1 in (await new Repository<trans1>().GetAll()).ToList()
    on emp.staffid equals trans1?.leadstaffid
    join trans2 in (await new Repository<trans2>().GetAll())
    on trans1.statusid equals trans2.statusid
    into tassta
    from ts in tassta.DefaultIfEmpty()
    group new { emp, trans1, ts }
    by new { emp.staffid, emp.fullname } into grp
    select new ReportTs
    {
        particulars = grp.FirstOrDefault().emp.fullname.Trim(),
        id = grp.FirstOrDefault().trans1.id,
        staffid = grp.FirstOrDefault().trans1.staffid,
        PByDep = grp.Where(ys => ys.trans1.statusid == 2).Select(ys1 => ys1.trans1.statusid).Count(),
        PFT = grp.Where(ys => ys.trans1.statusid == 3).Select(ys1 => ys1.trans1.statusid).Count(),
        PByC = grp.Where(ys => ys.trans1.statusid == 4).Select(ys1 => ys1.trans1.statusid).Count(),
        PFR = grp.Where(ys => ys.trans1.statusid == 5).Select(ys1 => ys1.trans1.statusid).Count(),
        inid = grp.FirstOrDefault().trans1.inid,
        rowtotal = grp.Count(ys => ys.trans1.statusid == null) +
                  grp.Count(ys => ys.trans1.statusid == 2) +
                  grp.Count(ys => ys.trans1.statusid == 3) +
                  grp.Count(ys => ys.trans1.statusid == 4) +
                  grp.Count(ys => ys.trans1.statusid == 5) ,
        PApp = true,
        CDate = false
    }).Distinct().ToList();

The problem here is that you are running the query in memory after you have fetched all of the data from the database.

I'm assuming that you have a table called 'emp' in your database. When you call

 (await new Repository<emp>().GetAll()).ToList()

you are moving all of the data from that table to your application memory. This is the same as

SELECT * FROM emp

Of course this takes a long time if the table contains many rows.

After you have fetched all the data, you are using LINQ to run an in-memory query against it.
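The difference can be made visible without a database. The following is a minimal sketch, assuming a hypothetical `Emp` class and an in-memory list standing in for `Repository<emp>.GetAll()`: a filter composed on an `IQueryable<T>` is kept as an expression tree that a provider could translate to SQL, while a filter applied after `ToList()` is ordinary LINQ to Objects over rows that are already loaded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity; the real 'emp' class is not shown in the question.
public class Emp
{
    public int Id { get; set; }
    public string FullName { get; set; } = "";
}

public static class QueryShapeDemo
{
    // Stand-in for Repository<emp>.GetAll(): an in-memory IQueryable (assumption).
    public static IQueryable<Emp> GetAll() =>
        new List<Emp>
        {
            new Emp { Id = 5, FullName = "Ann" },
            new Emp { Id = 20, FullName = "Bob" }
        }.AsQueryable();

    // Filter composed on IQueryable: recorded as an expression tree,
    // so a database provider gets the chance to turn it into a WHERE clause.
    public static string DescribeDeferred()
    {
        IQueryable<Emp> query = GetAll().Where(e => e.Id > 10);
        var call = (MethodCallExpression)query.Expression;
        return call.Method.DeclaringType?.Name + "." + call.Method.Name;
    }

    // Filter composed after ToList(): the rows are already in memory,
    // so the provider never sees the condition.
    public static string DescribeMaterialized()
    {
        IEnumerable<Emp> rows = GetAll().ToList().Where(e => e.Id > 10);
        return rows is IQueryable<Emp> ? "still a query" : "in-memory sequence";
    }
}
```

Calling `DescribeDeferred()` reports that the filter is a `Queryable.Where` call inside the query expression, whereas `DescribeMaterialized()` shows that the result is no longer a query at all.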

To improve the performance, you have to remove the 'ToList()' calls from the first and second lines of the query, because they materialize the data.

After you have done that, you have to rewrite the query, because as written it cannot be translated to a SQL query.

Your goal should be to have a query that can be run against your DB so that you fetch only the data you need.
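As an illustration, the report could be reshaped roughly as follows. This is only a sketch under several assumptions: the classes `Emp`, `Trans` and `ReportRow` and their property names are hypothetical stand-ins for the entities in the question; the left join to 'trans2' is dropped on the assumption that it is a status lookup with one row per statusid and therefore does not change the counts; and the columns taken from an arbitrary first row of each group (id, inid) are left out. Whether a particular provider translates this exact shape should be checked against the SQL it generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes (assumptions, not the asker's real classes).
public class Emp
{
    public int StaffId { get; set; }
    public string FullName { get; set; } = "";
}

public class Trans
{
    public int LeadStaffId { get; set; }
    public int? StatusId { get; set; }
}

public class ReportRow
{
    public int StaffId { get; set; }
    public string Particulars { get; set; } = "";
    public int PByDep { get; set; }
    public int PFT { get; set; }
    public int PByC { get; set; }
    public int PFR { get; set; }
    public int RowTotal { get; set; }
}

public static class ReportQueries
{
    // Takes and returns IQueryable, so nothing is materialized here;
    // the caller decides when to execute the query (e.g. with ToList()).
    public static IQueryable<ReportRow> BuildReport(
        IQueryable<Emp> emps, IQueryable<Trans> transactions)
    {
        return from e in emps
               join t in transactions on e.StaffId equals t.LeadStaffId
               group t by new { e.StaffId, e.FullName } into g
               select new ReportRow
               {
                   StaffId = g.Key.StaffId,
                   Particulars = g.Key.FullName,
                   // One conditional count per status instead of Where/Select/Count chains.
                   PByDep = g.Count(t => t.StatusId == 2),
                   PFT = g.Count(t => t.StatusId == 3),
                   PByC = g.Count(t => t.StatusId == 4),
                   PFR = g.Count(t => t.StatusId == 5),
                   // Same total as the original: null plus statuses 2 to 5.
                   RowTotal = g.Count(t => t.StatusId == null
                                           || (t.StatusId >= 2 && t.StatusId <= 5))
               };
    }
}
```

Because the method only composes the query, the grouping and counting can be done by the database, and the application receives one small row per staff member instead of every transaction.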

---EDIT---

Here you have two examples. In the first one all of the data will be fetched from the db and then the query will be run in memory (as you are doing now). In the second one the query will be run in the database and you'll fetch only the desired data.

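 public class Repository<T>
 {
      // Returns a composable query; nothing is fetched until it is enumerated.
      public Task<IQueryable<T>> GetAll(){...}
 }

 public class Examples
 {
      // Example 1: ToList() loads every row first, then the filter runs in memory.
      public static async Task Example1()
      {
           var repository = new Repository<emp>();

           // Await the task first, then materialize the whole table.
           var emps = (await repository.GetAll()).ToList();

           var reports = from emp in emps
                         where emp.Id > 10
                         select new ReportData(){
                              ...
                         };
      }

      // Example 2: the filter is part of the query, so it runs in the database.
      public static async Task Example2()
      {
           var repository = new Repository<emp>();

           // Still an IQueryable: no data has been fetched yet.
           var emps = await repository.GetAll();

           // ToList() at the very end executes the query and fetches only matching rows.
           var reports = (from emp in emps
                          where emp.Id > 10
                          select new ReportData(){
                               ...
                          }).ToList();
      }
 }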

Keep in mind that although the syntax is very similar those two examples do very different things.

In the first case the LINQ query is executed in memory, effectively as foreach loops over the loaded objects; in the second case it ends up as a query for Postgres. This means that in the second case you cannot use expressions that the provider is unable to translate to a Postgres query (e.g. particulars = grp.FirstOrDefault().emp.fullname.Trim()).
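When some per-row processing really has to happen in .NET, one common approach is to let the database do the filtering and projection and to switch to in-memory LINQ only for the final step with `AsEnumerable()`. The sketch below assumes a hypothetical `Emp` class and a hypothetical helper `FormatName`, which stands for any C# code a SQL provider could not translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; the real 'emp' class is not shown in the question.
public class Emp
{
    public int Id { get; set; }
    public string FullName { get; set; } = "";
}

public static class PostProcessing
{
    // Hypothetical helper: arbitrary C# logic that cannot be expressed in SQL.
    public static string FormatName(string raw) => raw.Trim().ToUpperInvariant();

    public static List<string> NamesAbove(IQueryable<Emp> emps, int minId)
    {
        return emps
            .Where(e => e.Id > minId)               // part of the query: can run in the database
            .Select(e => e.FullName)                // fetch only the column that is needed
            .AsEnumerable()                         // from here on: LINQ to Objects
            .Select(name => FormatName(name))       // any .NET method is allowed now
            .ToList();
    }
}
```

Only the rows and columns that survive the `Where` and the first `Select` cross the wire; the untranslatable part then runs over that much smaller result.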


 