Optimizing slow LINQ query

I need to optimize the following loop, which takes 20 seconds to run:

    foreach (IGrouping<DateTime, DateTime> item in groups)
    {
        var countMatchId = initialGroups
                        .Where(grp => CalculateArg(grp.a.Arg) == item.Key && grp.b.Arg == someId)
                        .Sum(y => y.c.Value);

        var countAll = initialGroups
                        .Where(grp => CalculateArg(grp.a.Arg) == item.Key)
                        .Sum(y => y.c.Value);
    }

...where CalculateArg is a relatively expensive function. I assumed CalculateArg was the culprit and should therefore be called in only one query, so I came up with this:

    foreach (IGrouping<DateTime, DateTime> item in groups)
    {
        var result = initialGroups
                        .Where(grp => CalculateArg(grp.a.Arg) == item.Key);

        var countMatchId = result
                        .Where(x => x.b.Arg == someId).Sum(y => y.c.Value);

        var countAll = result
                        .Sum(y => y.c.Value);
    }

The problem is that this only saves about 200 milliseconds, so it optimized almost nothing. For countMatchId I still have a .Where() that iterates all elements and a .Sum() that iterates them as well, and then another .Sum() for countAll iterates all elements again.

How could I optimize this further? I'm sure there is something obvious that I'm missing.

var result = initialGroups
                    .Where(grp => CalculateArg(grp.a.Arg) == item.Key);

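This isn't cached. Where() is evaluated lazily, so the filter (and with it CalculateArg) runs again every time result is enumerated.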

foreach (var x in result) {} 
foreach (var x in result) {} 
foreach (var x in result) {} 
foreach (var x in result) {} 

will recalculate everything 4 times.

Do it this way:

var result = initialGroups
                    .Where(grp => CalculateArg(grp.a.Arg) == item.Key)
                    .ToArray();
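
To make the effect of the missing cache visible, here is a minimal sketch that counts how often the expensive function runs. The CalculateArg below is a hypothetical stand-in (the real one is not shown in the question), and the int array only plays the role of initialGroups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how often the stand-in for the expensive function is invoked.
    public static int Calls;

    // Hypothetical stand-in for CalculateArg; the real function is not shown in the question.
    private static int CalculateArg(int arg)
    {
        Calls++;
        return arg % 2;
    }

    // Enumerates the same filtered sequence twice and returns the number of
    // CalculateArg calls that this caused.
    public static int CountCalls(bool materialize)
    {
        Calls = 0;
        int[] source = { 1, 2, 3, 4 }; // plays the role of initialGroups

        IEnumerable<int> result = source.Where(x => CalculateArg(x) == 0);
        if (materialize)
        {
            // Runs the filter once and stores the matching elements.
            result = result.ToArray();
        }

        _ = result.Sum();   // first pass over result
        _ = result.Count(); // second pass over result

        return Calls;
    }
}
```

With four source elements, CountCalls(false) reports 8 calls (the filter runs once per pass), while CountCalls(true) reports 4, because only ToArray() runs the filter. Note that in the question's loop this only removes the repeated passes within one iteration; CalculateArg still runs for every element on every iteration of the outer foreach.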

I guess this may improve it partly:

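foreach (IGrouping<DateTime, DateTime> item in groups)
{
    // Bucket 1: key and id both match; bucket 2: only the key matches; bucket 3: the rest.
    var sums = initialGroups
                    .GroupBy(grp => {
                            var c = CalculateArg(grp.a.Arg); // called once per element
                            return (c == item.Key && grp.b.Arg == someId) ? 1 :
                                    c == item.Key ? 2 : 3;
                            })
                    .ToDictionary(g => g.Key, g => g.Sum(x => x.c.Value));

    // A bucket may be missing, so look it up by key instead of by position;
    // TryGetValue leaves the default (0) when the key is absent.
    sums.TryGetValue(1, out var matchIdSum);
    sums.TryGetValue(2, out var keyOnlySum);
    var countMatchId = matchIdSum;
    var countAll = matchIdSum + keyOnlySum;
}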

There are a couple of things to consider here. First of all, where is your data coming from? Is it coming from an entity created by a DbContext? If so, consider querying and manipulating the data through the context instead of through a navigation property on the object. What do I mean by that? Consider the two classes below:

public class User{

   public int ID { get;set; } 
   public virtual ICollection<Animal> Animals {get;set;} 

}


public class Animal{
    public int ID { get; set; }
    public string Name {get;set;}
    [ForeignKey("Owner")]
    public int? Owner_ID {get;set;}
    public virtual User Owner {get;set;}
}

Now, instead of accessing the user's animals with the code below,

User user = Context.User.Single(t=> t.ID == 1);
List<Animal> animals = user.Animals.ToList();

querying the DbContext directly can be much more efficient. (Performance should also be taken into account if your list has something like 100k entities and you try to load it into memory with ToList.)

List<Animal> animals = Context.Animals.Where(t => t.Owner_ID == 1).ToList();

Besides, if you are not using an ORM framework, try loading all the objects involved in the computation into memory and caching them. This can improve performance considerably, because accessing an object that is already in memory is much cheaper than going through a queryable sequence. In your case, the groups object might be a queryable, which would explain the poor performance.

If there are lots of items in groups, you may benefit from changing the algorithm around.

Instead of iterating, try calculating things once and GroupJoin the results together, along these lines:

var calculated = initialGroups
  .Select(group => new { Group = group, Arg = CalculateArg(group.a.Arg) })
  .ToList();

var sumCollection = groups
  .GroupJoin(calculated,
             item => item.Key,
             group => group.Arg,
      (group, calculatedCollection) =>
         new {
            Group = group,
            SumAll = calculatedCollection.Sum(y => y.Group.c.Value),
            SumMatchId = calculatedCollection
                         .Where(y => y.Group.b.Arg == someId)
                         .Sum(y => y.Group.c.Value)
         });

foreach (var item in sumCollection)
{
    var countAll = item.SumAll;         // you get the idea
    var countMatchId = item.SumMatchId;
}

I found a way to fix it. Following the helpful comments on the question, I profiled almost every line of the foreach with a Stopwatch and found that the CalculateArg() function was indeed the culprit: calling it added 500 ms to each iteration. On a 40-item collection, that meant a total of 20000 ms = 20 seconds.

What I did was move the calculation outside the foreach, meaning the groups (anonymous objects made with SelectMany) now also include the result of CalculateArg() for each element. That brings the code to:

foreach (IGrouping<DateTime, DateTime> item in groups)
{
    var result = initialGroups
                    .Where(grp => grp.calculatedArg == item.Key);
}
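
Once the calculated value is stored on each element, the remaining per-item scan of initialGroups could also be avoided by indexing the elements by that value once. The following is only a sketch of that idea using ToLookup, which is not what the code above does. Row is a hypothetical flattened version of the anonymous type, and CalculateArg is again a stand-in that truncates a DateTime to its date.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical flattened row; the question's elements expose a.Arg, b.Arg and c.Value instead.
    public record Row(DateTime RawArg, int Id, int Value);

    // Hypothetical stand-in for the expensive CalculateArg.
    private static DateTime CalculateArg(DateTime raw) => raw.Date;

    // Returns both sums for every key. The keys are assumed to be distinct,
    // as the Key values of a GroupBy result are.
    public static Dictionary<DateTime, (int CountAll, int CountMatchId)> Summarize(
        IEnumerable<Row> rows, IEnumerable<DateTime> keys, int someId)
    {
        // CalculateArg runs exactly once per row while the lookup is built.
        ILookup<DateTime, Row> byArg = rows.ToLookup(r => CalculateArg(r.RawArg));

        // Indexing a lookup with a missing key yields an empty sequence,
        // so both sums are 0 for keys that have no rows.
        return keys.ToDictionary(
            key => key,
            key => (
                CountAll: byArg[key].Sum(r => r.Value),
                CountMatchId: byArg[key].Where(r => r.Id == someId).Sum(r => r.Value)));
    }
}
```

Each key then only touches its own rows, so the total work is roughly one pass over the rows plus one pass over the keys, instead of one full scan of the rows per key.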
