Why is double GroupBy + ToList taking too long?
I have these queries:
var Data = (from ftr in db.TB_FTR
            join mst in db.TB_MST on ftr.MST_ID equals mst.MST_ID
            join trf in db.TB_TRF on mst.TRF_ID equals trf.ID
            select new CityCountyType { City = ftr.CITY, County = ftr.COUNTY, Type = trf.TYPE }
           ).OrderBy(i => i.City).ThenBy(i => i.County);

var Data2 =
    Data.GroupBy(i => new {i.City, i.County, i.Type})
        .Select(group => new {Name = group.Key, Count = group.Count()})
        .OrderBy(x => x.Name)
        .ThenByDescending(x => x.Count)
        .GroupBy(g => new {g.Name.City, g.Name.County})
        .Select(g => g.Select(g2 =>
            new {Name = new {g.Key.City, g.Key.County, g2.Name.Type}, g2.Count})).ToList();
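I am trying to get lists of objects that share the same County and City. But the second query takes too long to return a result. I waited about 30 minutes and got no answer, even though the list Data has only about 5,000 records. How can I change these queries so that I get the list of lists I want? Thanks in advance.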
For example, this query returns a list like this:
{ Name = {{ City = New York City, County = Bronx, Type = Type A }}, Count = 4 }
{ Name = {{ City = New York City, County = Bronx, Type = Type B }}, Count = 8 }
{ Name = {{ City = New York City, County = Bronx, Type = Type C }}, Count = 24 }
{ Name = {{ City = New York City, County = Manhattan, Type = Type B }}, Count = 43 }
{ Name = {{ City = New York City, County = Manhattan, Type = Type C }}, Count = 58 }
{ Name = {{ City = Seattle, County = King County, Type = Type D }}, Count = 43 }
{ Name = {{ City = Seattle, County = King County, Type = Type A }}, Count = 67 }
{ Name = {{ City = Seattle, County = Snohomish County, Type = Type C }}, Count = 67 }
I want to split this list into several lists like this:
List 1:
{ Name = {{ City = New York City, County = Bronx, Type = Type A }}, Count = 4 }
{ Name = {{ City = New York City, County = Bronx, Type = Type B }}, Count = 8 }
{ Name = {{ City = New York City, County = Bronx, Type = Type C }}, Count = 24 }
List 2:
{ Name = {{ City = New York City, County = Manhattan, Type = Type B }}, Count = 43 }
{ Name = {{ City = New York City, County = Manhattan, Type = Type C }}, Count = 58 }
List 3:
{ Name = {{ City = Seattle, County = King County, Type = Type D }}, Count = 43 }
{ Name = {{ City = Seattle, County = King County, Type = Type A }}, Count = 67 }
List 4:
{ Name = {{ City = Seattle, County = Snohomish County, Type = Type C }}, Count = 67 }
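Possibility 1: Your database is not indexed to support your query (the where and join clauses).
To find out, get the generated SQL and look at its execution plan. If the plan shows nested loop join -> clustered index scan, that is the problem.
Possibility 2: You have hit the n + 1 problem.
In LINQ's GroupBy, a group consists of the group key and the group's members. In most SQL implementations, however, GROUP BY gives you only the group key and aggregates. To get the members of a group, a separate query is issued. If there are n groups, n queries must be issued (the +1 is the original query).
To find out, get the generated SQL. If many extra queries are issued and any of them shows a clustered index scan, you have found the problem.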
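Possibility 3: You are actually issuing n^2 queries (with n near 5,000, that would be on the order of 25,000,000).
You grouped twice, so it could be a doubly nested loop. Look at the generated SQL to find out.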
The simplest fix is to pull the 5,000 records into memory before grouping. An easy way to do that is to call ToList before calling GroupBy.
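As a rough sketch of that fix, the example below materializes the rows once with ToList and then does both groupings in memory with LINQ to Objects. The sample rows are made up and stand in for the joined query; in the real code the first step would be Data.ToList(). The ordering (groups by City then County, items by Count descending) is an assumption about the intended output, and it avoids OrderBy(x => x.Name), since an anonymous type is not comparable in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up rows standing in for the result of the joined query (Data).
var data = new[]
{
    new { City = "New York City", County = "Bronx",       Type = "Type A" },
    new { City = "New York City", County = "Bronx",       Type = "Type B" },
    new { City = "New York City", County = "Bronx",       Type = "Type B" },
    new { City = "New York City", County = "Manhattan",   Type = "Type C" },
    new { City = "Seattle",       County = "King County", Type = "Type A" },
};

// Materialize once. With a database-backed IQueryable, this is the single
// round trip; everything after it runs in memory.
var rows = data.ToList();

var lists = rows
    // First grouping: count rows per (City, County, Type).
    .GroupBy(r => new { r.City, r.County, r.Type })
    .Select(g => new { Name = g.Key, Count = g.Count() })
    // Second grouping: one list per (City, County).
    .GroupBy(x => new { x.Name.City, x.Name.County })
    .OrderBy(g => g.Key.City)
    .ThenBy(g => g.Key.County)
    // Assumed ordering inside each list: highest count first.
    .Select(g => g.OrderByDescending(x => x.Count).ToList())
    .ToList();

foreach (var list in lists)
{
    Console.WriteLine($"{list[0].Name.City} / {list[0].Name.County}:");
    foreach (var item in list)
        Console.WriteLine($"  {item.Name.Type}: {item.Count}");
}
```

This trades one larger result set for the many per-group queries, which is reasonable at about 5,000 rows but would need rethinking for a much larger table.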