
Efficiently paging large data sets with LINQ

When looking into the best ways to implement paging in C# (using LINQ), most suggestions are something along these lines:

// Execute the query
var query = db.Entity.Where(e => e.Something == something);

// Get the total num records
var total = query.Count();

// Page the results
var paged = query.Skip((pageNum - 1) * pageSize).Take(pageSize);

This seems to be the commonly suggested strategy (simplified).

For me, the main purpose of paging is efficiency. If my table contains 1.2 million records where Something == something, I don't want to retrieve all of them at the same time. Instead, I want to page the data, grabbing as few records as possible. But with this method, that seems to be a moot point.

If I understand it correctly, the first statement still retrieves all 1.2 million records, and the results are then paged as necessary.

Does paging in this way actually improve performance? If the 1.2 million records are going to be retrieved every time, what's the point (besides the obvious UI benefits)?

Am I misunderstanding this? Are there any .NET gurus out there who can give me a lesson on LINQ, paging, and performance (when dealing with large data sets)?

The first statement does not execute the actual SQL query; it only builds part of the query you intend to run.

The first SQL query is executed when you call query.Count():

SELECT COUNT(*) FROM Table WHERE Something = something

query.Skip().Take() won't execute the query either. Only when you enumerate the results (doing a foreach over paged or calling .ToList() on it) will it execute the appropriate SQL statement, retrieving only the rows for the page (using ROW_NUMBER).

If you watch this in SQL Profiler, you will see that exactly two queries are executed, and at no point does it try to retrieve the full table.
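The deferred execution described above can be observed without a database. The following is a minimal sketch using LINQ to Objects, not the asker's db.Entity: the class name DeferredExecutionDemo, the in-memory source, and the predicate counter are all assumptions made for illustration. It only shows that building the query runs nothing. With a database provider the filtering and paging are translated to SQL instead of being run in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; it stands in for a real data context.
public static class DeferredExecutionDemo
{
    public static (int CallsAfterBuild, int CallsAfterEnumerate, List<int> Page)
        Run(int pageNum, int pageSize)
    {
        int predicateCalls = 0;
        IEnumerable<int> source = Enumerable.Range(1, 1000);

        // Building the query: the predicate has not run yet.
        var query = source.Where(n => { predicateCalls++; return n % 2 == 0; });
        var paged = query.Skip((pageNum - 1) * pageSize).Take(pageSize);
        int callsAfterBuild = predicateCalls;

        // Enumerating the page is what actually runs the query.
        List<int> page = paged.ToList();

        return (callsAfterBuild, predicateCalls, page);
    }
}
```

Calling DeferredExecutionDemo.Run(2, 5) reports zero predicate calls after the query is built, and returns the page 12, 14, 16, 18, 20 once the result is enumerated.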


Be careful when you are using the debugger: if you step past the first statement and inspect the contents of query, that will execute the SQL query. Maybe that is the source of your misunderstanding.

// Execute the query
var query = db.Entity.Where(e => e.Something == something);

Note that no database call is made by the first statement.

// Get the total num records
var total = query.Count();

This count query will be translated to SQL, and it will make a call to the database. The call will not fetch all the records, because the generated SQL is something like this:

SELECT COUNT(*) FROM Entity WHERE Something = 'something'

The last query doesn't fetch all the records either. It will be translated into SQL, and the paging runs in the database.

Maybe you'll find this question useful: efficient way to implement paging
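The count-then-page pattern can be wrapped in a small helper. This is only a sketch: the names Paging and GetPage are hypothetical, not part of LINQ or Entity Framework. It takes an IOrderedQueryable<T> because paging without an explicit order does not give stable pages, and LINQ to Entities rejects Skip on unsorted input. Both Count() and the paged ToList() are issued against whatever provider backs the query, so with a database provider each becomes one SQL statement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, shown for illustration only.
public static class Paging
{
    public static (int Total, List<T> Items) GetPage<T>(
        IOrderedQueryable<T> ordered, int pageNum, int pageSize)
    {
        if (pageNum < 1) throw new ArgumentOutOfRangeException(nameof(pageNum));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // First round trip: a count over the filtered set.
        int total = ordered.Count();

        // Second round trip: only the rows of the requested page.
        List<T> items = ordered
            .Skip((pageNum - 1) * pageSize)
            .Take(pageSize)
            .ToList();

        return (total, items);
    }
}
```

With the asker's query this might be called as Paging.GetPage(query.OrderBy(e => e.Id), pageNum, pageSize), where the Id key is an assumption about the entity.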

I believe Entity Framework structures the SQL query with the appropriate conditions based on the LINQ statements (e.g. using ROW_NUMBER() OVER ...).

I could be wrong on that, however. I'd run SQL Profiler and see what the generated query looks like.

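Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.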

 