
Adapting “compute intermediate results in a temp table” SQL pattern to LINQ?

Original post: 2011-11-24 06:36:22 | Tags: c#, linq-to-sql, sql-server-2008-r2

My team builds a C# web application that can generate ad-hoc reports that recombine data from the same core set of SQL Server 2008 R2 tables in different ways. For example, a single "dashboard" page may combine a list of today's sales in each region, a list of lowest-selling items and their trends over the last week, a list of top-performing salespeople, and 20+ other metrics and charts. Under the covers, a typical dashboard page will require at least 30 queries across 20+ different tables.

Unfortunately, it's not practical to "freeze" this data and pre-compute it; we need to fetch real-time data on the fly.

To make these pages fast, our trick has been to identify different queries that pull the same underlying data. We compute intermediate results from those underlying tables, cache those results in a temp table, and then join that temp table to other tables to compute the final results. Using this approach we can typically reduce the I/O and time required for a particular dashboard by a factor of 10.

Our team would like to apply this same pattern to a similar page that uses LINQ-to-SQL for data access. We like LINQ for programming ease of use, for unit testing, etc. But performance stinks for the kind of application described above, where we execute multiple queries that may partially depend on the same underlying data.

Of course I can call AsEnumerable() to materialize the intermediate query results. But if the intermediate results are large, then getting the results out of SQL Server and back in negates the performance win, and it creates inefficient parameterized queries with IN (@p1, ... ) clauses that are hundreds of items long.
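To make the problem concrete, here is a minimal sketch of that client-side materialization. The Sale and Item classes, their table names, and their columns are hypothetical stand-ins, not the actual schema. ToList() is used to force the intermediate results into memory before the second query is built.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical mappings; the real tables and columns will differ.
[Table(Name = "dbo.Sales")]
public class Sale
{
    [Column(IsPrimaryKey = true)] public int SaleId { get; set; }
    [Column] public int ItemId { get; set; }
    [Column] public DateTime SaleDate { get; set; }
}

[Table(Name = "dbo.Items")]
public class Item
{
    [Column(IsPrimaryKey = true)] public int ItemId { get; set; }
    [Column] public string Name { get; set; }
}

public static class ClientSideMaterialization
{
    public static List<Item> ItemsSoldOn(DataContext db, DateTime day)
    {
        DateTime start = day.Date;
        DateTime end = start.AddDays(1);

        // Round trip 1: the intermediate ids are pulled into client memory.
        List<int> itemIds = db.GetTable<Sale>()
            .Where(s => s.SaleDate >= start && s.SaleDate < end)
            .Select(s => s.ItemId)
            .Distinct()
            .ToList();

        // Round trip 2: every id is sent back as its own parameter,
        // producing WHERE ItemId IN (@p0, @p1, ...) on the server.
        return db.GetTable<Item>()
            .Where(i => itemIds.Contains(i.ItemId))
            .ToList();
    }
}
```

Besides the extra traffic, SQL Server accepts at most 2100 parameters per request, so this shape fails outright once the intermediate list grows past that size.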

In a perfect world, LINQ-to-SQL would offer an AsServerEnumerable() method which would create a temporary table of intermediate results that I could re-use downstream without leaving the DB.

Does something like this exist?

If not, do you have any suggestions for how to make our "server-side intermediate materialization" pattern work well with LINQ?

PS - I'm saying "temporary table" and not "table variable" above because temp tables tend to work better with more expensive queries (parallel query plans, non-clustered indexes, etc.). But otherwise all of the above would apply to table variables too.

2 Answers

No, that does not exist in raw LINQ and is not pre-canned in any LINQ-style API I'm aware of.

It can exist if you ignore the "LINQ" part of LINQ-to-SQL and just use the db.ExecuteQuery<T>(sql, args) approach. If you do that, you must take care to pass an explicit, already open connection to the data-context. If you use the connection-string approach, connection management is handled automatically and you are not guaranteed to get the same connection between operations. It could be taken from the pool, and even if it is the same underlying connection, it will have been reset, dropping any temp tables.
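A minimal sketch of that approach follows. The table dbo.Sales, its columns, the #TodaySales temp table, and the RegionTotal class are all hypothetical, and Amount is assumed to be non-nullable. One detail that is my own assumption rather than part of the answer: the temp table is created in a separate command without parameters, because a parameterized command is normally sent through sp_executesql, and a temp table created inside that scope is dropped when the call returns.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.SqlClient;
using System.Linq;

// Hypothetical result shape; members are matched to result columns by name.
public class RegionTotal
{
    public int RegionId { get; set; }
    public decimal Total { get; set; }
}

public static class TempTableDashboard
{
    public static List<RegionTotal> SalesByRegion(string connectionString, DateTime day)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            // Open the connection ourselves so every command below
            // runs in the same session and sees the same temp table.
            conn.Open();

            using (var db = new DataContext(conn))
            {
                // No parameters: sent as a plain batch, so #TodaySales
                // lives for the session, not just for this command.
                db.ExecuteCommand(
                    @"CREATE TABLE #TodaySales
                      (RegionId int NOT NULL, ItemId int NOT NULL, Amount money NOT NULL)");

                // Compute the shared intermediate result once, on the server.
                db.ExecuteCommand(
                    @"INSERT INTO #TodaySales (RegionId, ItemId, Amount)
                      SELECT RegionId, ItemId, SUM(Amount)
                      FROM dbo.Sales
                      WHERE SaleDate >= {0} AND SaleDate < {1}
                      GROUP BY RegionId, ItemId",
                    day.Date, day.Date.AddDays(1));

                // Downstream queries reuse #TodaySales without leaving the DB.
                return db.ExecuteQuery<RegionTotal>(
                    @"SELECT RegionId, SUM(Amount) AS Total
                      FROM #TodaySales
                      GROUP BY RegionId").ToList();
            }
        }
    }
}
```

Further dashboard queries would go inside the same using block, joining #TodaySales to other tables as needed. The temp table disappears when the connection is closed.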

Well, if you have a lot of reads you can consider creating a VIEW instead of a temporary table and adding a unique clustered index to that view. This will materialize the view in the database.

Indexed views can be used by SQL Server in two different ways. First, the view can be called directly from a query, as conventional views are currently used. But instead of running the view's underlying SELECT statement and creating the view's result set on the fly, it uses the unique clustered index to return the results of the view almost immediately. Second, any query that is run on SQL Server 2000/2005 is automatically evaluated to see whether an existing indexed view would fulfill the query. If so, the Query Optimizer uses the indexed view, even though it has not been specified in the query, greatly speeding up the query.
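As a rough sketch of what that could look like, the one-time setup below creates an indexed view through a DataContext and then reads from it. The view name, index name, dbo.Sales table, and RegionAmount class are hypothetical, and Amount is assumed to be declared NOT NULL, since an indexed view cannot SUM a nullable expression. Note that automatic matching of queries to indexed views is, as far as I know, limited to the higher editions of SQL Server; on other editions the view has to be referenced directly with the NOEXPAND hint, as shown here.

```csharp
using System.Collections.Generic;
using System.Data.Linq;
using System.Linq;

// Hypothetical result shape; members are matched to result columns by name.
public class RegionAmount
{
    public int RegionId { get; set; }
    public decimal Total { get; set; }
}

public static class IndexedViewSketch
{
    // One-time setup; would normally live in a deployment script.
    public static void Create(DataContext db)
    {
        // CREATE VIEW must be the only statement in its batch.
        // SCHEMABINDING and COUNT_BIG(*) are required for an
        // indexed view that uses GROUP BY.
        db.ExecuteCommand(
            @"CREATE VIEW dbo.SalesByRegionItem WITH SCHEMABINDING AS
              SELECT RegionId, ItemId,
                     SUM(Amount) AS Amount,
                     COUNT_BIG(*) AS SaleCount
              FROM dbo.Sales
              GROUP BY RegionId, ItemId");

        // The unique clustered index is what materializes the view.
        db.ExecuteCommand(
            @"CREATE UNIQUE CLUSTERED INDEX IX_SalesByRegionItem
              ON dbo.SalesByRegionItem (RegionId, ItemId)");
    }

    public static List<RegionAmount> TotalsByRegion(DataContext db)
    {
        // NOEXPAND asks the optimizer to read the view's index
        // instead of expanding the view definition.
        return db.ExecuteQuery<RegionAmount>(
            @"SELECT RegionId, SUM(Amount) AS Total
              FROM dbo.SalesByRegionItem WITH (NOEXPAND)
              GROUP BY RegionId").ToList();
    }
}
```

Unlike a temp table, the view is maintained on every write to dbo.Sales, so it trades some insert cost for reads that need no per-request setup.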