
Linq query timing out, how to streamline query

Our front-end UI has a filtering system that, on the back end, operates over millions of rows. It uses an IQueryable that is built up over the course of the logic, then executed all at once. The individual UI components are ANDed together (for example, Dropdown1 and Dropdown2 will only return rows that match the selections in both). This is not a problem. However, Dropdown3 has two types of data in it, and the checked items need to be ORed together, then ANDed with the rest of the query.

Because of the large number of rows it operates over, the query keeps timing out. Since some additional joins need to happen, it is somewhat tricky. Here is my code, with the table names replaced:

//The end list has driver ids in it--but the data comes from two different places. Build a list of all the driver ids.
driverIds = db.CarDriversManyToManyTable.Where(
                        cd =>
                            filter.CarIds.Contains(cd.CarId) //get driver IDs for each car ID listed in filter object
                            ).Select(cd => cd.DriverId).Distinct().ToList();

driverIds = driverIds.Concat(
                    db.DriverShopManyToManyTable.Where(ds => filter.ShopIds.Contains(ds.ShopId)) //Get driver IDs for each Shop listed in filter object
                        .Select(ds => ds.DriverId)
                        .Distinct()).Distinct().ToList();
//Now we have a list solely of driver IDs

//The query operates over the Driver table. The query is built up like this for each item in the UI. Changing from Linq is not an option.
query = query.Where(d => driverIds.Contains(d.Id));

How can I streamline this query so that I don't have to retrieve thousands and thousands of IDs into memory, then feed them back into SQL?

There are several ways to produce a single SQL query. All of them require that you keep the parts of the query as IQueryable<T>, i.e. do not use ToList, ToArray, AsEnumerable, or similar methods that force the parts to be executed and evaluated in memory.

One way is to create a Union query containing the filtered IDs (which will be unique by definition) and use the join operator to apply it to the main query:

var driverIdFilter1 = db.CarDriversManyToManyTable
    .Where(cd => filter.CarIds.Contains(cd.CarId))
    .Select(cd => cd.DriverId);
var driverIdFilter2 = db.DriverShopManyToManyTable
    .Where(ds => filter.ShopIds.Contains(ds.ShopId))
    .Select(ds => ds.DriverId);
var driverIdFilter = driverIdFilter1.Union(driverIdFilter2);
query = query.Join(driverIdFilter, d => d.Id, id => id, (d, id) => d);

Another way is to use two Any-based conditions ORed together, which translate to an EXISTS(...) OR EXISTS(...) SQL query filter:

query = query.Where(d =>
    db.CarDriversManyToManyTable.Any(cd => d.Id == cd.DriverId && filter.CarIds.Contains(cd.CarId))
    ||
    db.DriverShopManyToManyTable.Any(ds => d.Id == ds.DriverId && filter.ShopIds.Contains(ds.ShopId))
);

You could try both and see which one performs better.

The answer to this question is complex and has many facets that, individually, may or may not help in your particular case.

First of all, consider using pagination, e.g. .Skip(PageNum * PageSize).Take(PageSize). I doubt your users need to see millions of rows at once in the front end. Show them only 100, or whatever other smaller number seems reasonable to you.
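
A minimal sketch of that paging step, assuming a hypothetical Driver class with an Id and an in-memory list standing in for the database table. PageOf, pageNum and pageSize are illustrative names, not part of the question's code. The OrderBy matters: Entity Framework 6 rejects Skip on unordered input, and in any case pages are not well defined without a stable order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Driver
{
    public int Id { get; set; }
}

public static class Paging
{
    // Returns one zero-based page of drivers. The result is still an IQueryable,
    // so against a database provider nothing runs until it is enumerated.
    public static IQueryable<Driver> PageOf(IQueryable<Driver> query, int pageNum, int pageSize)
    {
        return query
            .OrderBy(d => d.Id)          // stable order so pages do not overlap
            .Skip(pageNum * pageSize)
            .Take(pageSize);
    }

    public static void Main()
    {
        // In-memory stand-in for the Driver table (an assumption for this sketch).
        IQueryable<Driver> drivers = Enumerable.Range(1, 250)
            .Select(i => new Driver { Id = i })
            .AsQueryable();

        // Third page (index 2) of 100 rows: only 50 rows remain.
        List<int> page = PageOf(drivers, 2, 100).Select(d => d.Id).ToList();
        Console.WriteLine($"{page.Count} rows, ids {page.First()}..{page.Last()}");
    }
}
```

In the real application the paging would be applied last, after all the filter conditions have been added to the query.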

You've mentioned that you need to use joins to get the data you need. These joins can be done while forming your IQueryable (Entity Framework), rather than in memory (LINQ to Objects). Read up on join syntax in LINQ.

However, performing explicit joins in LINQ is not the best practice, especially if you are designing the database yourself. If you generate your entities database-first, consider placing foreign-key constraints on your tables. Database-first entity generation will then pick those up and provide you with navigation properties, which will greatly simplify your code.
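
A sketch of what the Dropdown3 filter might look like with navigation properties, assuming the generated Driver entity exposes hypothetical Cars and Shops collections (the real property names depend on your model). In-memory lists stand in for the tables here; against Entity Framework the same Where would stay part of the IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities; the names are assumptions, not the question's schema.
public class Car { public int Id { get; set; } }
public class Shop { public int Id { get; set; } }
public class Driver
{
    public int Id { get; set; }
    public List<Car> Cars { get; set; } = new List<Car>();
    public List<Shop> Shops { get; set; } = new List<Shop>();
}

public static class NavigationFilter
{
    // Keeps drivers linked to any selected car OR any selected shop.
    public static IQueryable<Driver> Apply(IQueryable<Driver> query, List<int> carIds, List<int> shopIds)
    {
        return query.Where(d =>
            d.Cars.Any(c => carIds.Contains(c.Id)) ||
            d.Shops.Any(s => shopIds.Contains(s.Id)));
    }

    public static void Main()
    {
        // In-memory stand-in for the Driver table (an assumption for this sketch).
        var drivers = new List<Driver>
        {
            new Driver { Id = 1, Cars = { new Car { Id = 10 } } },
            new Driver { Id = 2, Shops = { new Shop { Id = 20 } } },
            new Driver { Id = 3, Cars = { new Car { Id = 11 } } },
        }.AsQueryable();

        var ids = Apply(drivers, new List<int> { 10 }, new List<int> { 20 })
            .Select(d => d.Id)
            .ToList();
        Console.WriteLine(string.Join(",", ids)); // prints "1,2"
    }
}
```

The OR lives inside a single Where, so it can be ANDed with the other UI filters simply by chaining further Where calls on the same query.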

If you do not have any control or influence over the database design, however, then I recommend you construct your query in SQL first to see how it performs. Optimize it there until you get the desired performance, and then, as a last resort, translate it into an Entity Framework LINQ query that uses explicit joins.

To speed such queries up, you will likely need indexes on all of the "key" columns that you are joining on. The best way to figure out which indexes you need is to take the SQL query generated by your EF LINQ and bring it over to SQL Server Management Studio. From there, update the generated SQL to provide some predefined example values for your @p parameters. Once you've done this, right-click on the query and use either Display Estimated Execution Plan or Include Actual Execution Plan. If indexing can improve your query performance, there is a pretty good chance that this feature will tell you so and even provide you with scripts to create the indexes you need.

It looks to me as though using the instance (method-call) versions of the LINQ extensions creates several collections before you're done. Using the from statement versions should cut that down quite a bit:

driverIds = (from record in db.CarDriversManyToManyTable
             where filter.CarIds.Contains(record.CarId)
             select record.DriverId).Concat
            (from record in db.DriverShopManyToManyTable
             where filter.ShopIds.Contains(record.ShopId)
             select record.DriverId).Distinct();

Also, using the GroupBy extension would give better performance than querying each driver ID.


 