Large LINQ Grouping query, what's happening behind the scenes

Take the following LINQ query as an example. Please don't comment on the code itself, as I've just typed it to help with this question.

The following LINQ query uses a 'group by' and calculates summary information. As you can see, numerous calculations are being performed on the data, but how efficient is LINQ behind the scenes?

var NinjasGrouped = (from ninja in Ninjas
    // the range variable is 'ninja'; group on the composite (clan, rank) key
    group ninja by new { ninja.NinjaClan, ninja.NinjaRank }
    into con
    select new NinjaGroupSummary
    {
        NinjaClan = con.Key.NinjaClan,
        NinjaRank = con.Key.NinjaRank,
        NumberOfShoes = con.Sum(x => x.Shoes),
        MaxNinjaAge = con.Max(x => x.NinjaAge),
        MinNinjaAge = con.Min(x => x.NinjaAge),
        ComplicatedCalculation = con.Sum(x => x.NinjaGrade) != 0
            ? con.Sum(x => x.NinjaRedBloodCellCount) / con.Sum(x => x.NinjaDoctorVisits)
            : 0,
        ListOfNinjas = con.ToList()
    }).ToList();

  1. How many times is the list of 'Ninjas' being iterated over in order to calculate each of the values?
  2. Would it be faster to employ a foreach loop to speed up the execution of such a query?
  3. Would adding '.AsParallel()' after Ninjas result in any performance improvements?
  4. Is there a better way of calculating summary information for a List?

Any advice is appreciated, as we use this type of code throughout our software and I would really like to gain a better understanding of what LINQ is doing under the hood (so to speak). Perhaps there is a better way?

Assuming this is a LINQ to Objects query:

  • Ninjas is only iterated over once; the groups are built up into internal concrete lists, which you're then iterating over multiple times (once per aggregation).
  • Using a foreach loop almost certainly wouldn't speed things up. You might benefit a bit more from cache locality (as each time you iterate over a group, it'll probably have to fetch data from a higher-level cache or main memory), but I very much doubt that it would be significant. The increase in pain in implementing it probably would be significant though :)
  • Using AsParallel might speed things up, as the query looks pretty easily parallelizable. Worth a try...
  • There's not a much better way for LINQ to Objects, to be honest. It would be nice to be able to perform the aggregation as you're grouping, and Reactive Extensions would allow you to do something like that, but for the moment this is probably the simplest approach.
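
For comparison, here is a minimal sketch of what "performing the aggregation as you're grouping" could look like with a hand-written foreach loop. It is only an illustration, not a recommendation: the Ninja record, NinjaTotals and SinglePassSummary are hypothetical names that carry just a few of the question's fields, and all of them are assumed to be ints or strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's element type; only the fields used below.
public record Ninja(string NinjaClan, int NinjaRank, int Shoes, int NinjaAge);

// Mutable running totals for one (clan, rank) group (hypothetical helper type).
public sealed class NinjaTotals
{
    public int NumberOfShoes;
    public int MaxNinjaAge = int.MinValue;
    public int MinNinjaAge = int.MaxValue;
    public List<Ninja> Ninjas { get; } = new();
}

public static class SinglePassSummary
{
    // Visits every ninja exactly once and updates that ninja's group totals in place,
    // instead of buffering the groups first and re-reading each group per aggregate.
    public static Dictionary<(string Clan, int Rank), NinjaTotals> Summarize(IEnumerable<Ninja> ninjas)
    {
        var totals = new Dictionary<(string Clan, int Rank), NinjaTotals>();
        foreach (var ninja in ninjas)
        {
            var key = (ninja.NinjaClan, ninja.NinjaRank);
            if (!totals.TryGetValue(key, out var t))
            {
                t = new NinjaTotals();
                totals.Add(key, t);
            }
            t.NumberOfShoes += ninja.Shoes;
            t.MaxNinjaAge = Math.Max(t.MaxNinjaAge, ninja.NinjaAge);
            t.MinNinjaAge = Math.Min(t.MinNinjaAge, ninja.NinjaAge);
            t.Ninjas.Add(ninja);
        }
        return totals;
    }
}
```

Calling SinglePassSummary.Summarize(ninjas) returns one NinjaTotals per (clan, rank) pair. It is noticeably more code than the query for the same result, which is the "increase in pain" mentioned above; whether it is measurably faster would need profiling on real data.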

You might want to have a look at the GroupBy post in my Edulinq blog series for more details on a possible implementation.
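
As a rough illustration of the behaviour described above (and not the actual framework or Edulinq source), the sketch below shows a simplified, eager grouping routine that buffers each group into a concrete list in one pass, plus a small probe that counts how many elements the real GroupBy pulls from its source while several aggregates run per group. GroupingSketch, GroupEagerly and CountSourceReads are hypothetical names, and the real GroupBy is deferred and returns IGrouping values rather than key/list pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingSketch
{
    // Simplified, eager stand-in for GroupBy: one pass over the source,
    // each group buffered in a List<T>, groups kept in first-seen key order.
    public static List<KeyValuePair<TKey, List<TSource>>> GroupEagerly<TSource, TKey>(
        IEnumerable<TSource> source, Func<TSource, TKey> keySelector) where TKey : notnull
    {
        var index = new Dictionary<TKey, List<TSource>>();
        var ordered = new List<KeyValuePair<TKey, List<TSource>>>();
        foreach (var item in source)
        {
            var key = keySelector(item);
            if (!index.TryGetValue(key, out var bucket))
            {
                bucket = new List<TSource>();
                index.Add(key, bucket);
                ordered.Add(new KeyValuePair<TKey, List<TSource>>(key, bucket));
            }
            bucket.Add(item);
        }
        return ordered;
    }

    // Counts how many elements the real GroupBy reads from the source
    // while Sum, Max and Min are each computed per group.
    public static int CountSourceReads(int[] values)
    {
        int reads = 0;
        IEnumerable<int> Counted()
        {
            foreach (var v in values)
            {
                reads++;
                yield return v;
            }
        }

        _ = Counted()
            .GroupBy(v => v % 2)
            .Select(g => new { Sum = g.Sum(), Max = g.Max(), Min = g.Min() })
            .ToList();
        return reads;
    }
}
```

CountSourceReads returns the length of the input array: the three aggregates walk the buffered groups, not the original sequence.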
