Query performance: CTE using ROW_NUMBER() to select the first row

Question

We have three environments. In two of them my SQL query takes 30 to 38 seconds to run, but in the third it never completes and I have to cancel it. The query has two parts: a CTE and a very simple SELECT from a table. Both the CTE and the SELECT use the same table.

Could you please tell me why it takes so long? How can I improve the query?

ALTER VIEW [fact].[vPurchase] 
AS
    WITH VKPL AS 
    (
        SELECT * 
        FROM
            (SELECT 
                 iv.[Delivery_FK],
                 1 AS column2,
                 ROW_NUMBER() OVER(PARTITION BY [Delivery_FK] ORDER BY iv.UpdateDate) AS rk     
             FROM 
                 [fact].[KRMFact] iv   
             LEFT JOIN 
                 [dimension].[Product] pr ON iv.Product_FK =pr.Product_SK
             LEFT JOIN 
                 [dimension].[Delivery] le ON le.Delivery_FK = iv.Delivery_FK 
             WHERE 
                 pr.Product_Key = '740') X
        WHERE 
            rk = 1
    )
    SELECT 
         -- ....  here are some columns
         Delivery_FK,
         Product_FK,
         CAST(column2 AS VARCHAR) AS column2,
         f.[UpdateDate] AS [Update date]
     FROM 
         [fact].[KRMFact] f
     LEFT JOIN 
         VKPL v ON f.Delivery_FK = v.Delivery_FK

Answer 1

This is guesswork.

I guess the environment where this query is slow is the one with lots of production data in it.
I guess an index on your KRMFact table may help. Here's how to figure out what you need: SQL Server Management Studio (SSMS) can show you a query's execution plan. Put your query (the actual query, please, not a simplified one) into SSMS, right-click, and choose "Include Actual Execution Plan." Then run the query. The execution plan display may recommend an index to create to make this query run faster.
I guess you're trying to find the rows with the earliest values of UpdateDate.
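Since the query never finishes in the slow environment, an actual plan may never come back there. A minimal sketch of two checks that do not need the query to complete, assuming the view from the question ([fact].[vPurchase]) and that you run this from SSMS, where GO separates batches:

```sql
-- Compare the size of the fact table across the three environments.
SELECT COUNT_BIG(*) AS krmfact_rows
FROM [fact].[KRMFact];
GO

-- Ask for the estimated plan only; the statement below is compiled but not executed.
-- SET SHOWPLAN_XML has to be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO

SELECT *
FROM [fact].[vPurchase];
GO

SET SHOWPLAN_XML OFF;
GO
```

Comparing the estimated plan from a fast environment with the one from the slow environment can show whether the two are choosing different join strategies or missing an index.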

Your subquery

SELECT * 
  FROM
      (SELECT 
              iv.[Delivery_FK],
              1 AS column2,
              ROW_NUMBER() OVER(PARTITION BY [Delivery_FK] ORDER BY iv.UpdateDate) AS rk     
        FROM 
              [fact].[KRMFact] iv   
        LEFT JOIN 
             [dimension].[Product] pr ON iv.Product_FK =pr.Product_SK
        LEFT JOIN 
             [dimension].[Delivery] le ON le.Delivery_FK = iv.Delivery_FK 
       WHERE 
             pr.Product_Key = '740') X
 WHERE 
       rk = 1

looks like it picks out the row with the earliest KRMFact.UpdateDate for each value of KRMFact.Delivery_FK. That's what the ROW_NUMBER() OVER... WHERE rk=1 language does.

If my guess about that is correct, you can do it a different way, which may be more efficient.
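To make the ROW_NUMBER() pattern concrete, here is a small sketch on made-up rows (the values are illustrative only, not taken from the question's data). Each Delivery_FK partition is numbered by UpdateDate, and keeping rk = 1 leaves the earliest row per delivery:

```sql
SELECT Delivery_FK, UpdateDate
FROM (
    SELECT Delivery_FK,
           UpdateDate,
           -- Number the rows of each delivery, earliest UpdateDate first.
           ROW_NUMBER() OVER (PARTITION BY Delivery_FK ORDER BY UpdateDate) AS rk
    FROM (VALUES
            (1, '2021-01-03'),
            (1, '2021-01-01'),
            (2, '2021-02-10'),
            (2, '2021-02-05')
         ) AS t (Delivery_FK, UpdateDate)
) AS ranked
WHERE rk = 1
ORDER BY Delivery_FK;
-- Returns (1, '2021-01-01') and (2, '2021-02-05').
```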

SELECT * 
  FROM
      (SELECT 
              iv.[Delivery_FK],
              1 AS column2,
              1 AS rk
        FROM 
              [fact].[KRMFact] iv   
        JOIN (   SELECT Delivery_FK, MIN(UpdateDate) first_update
                   FROM [fact].[KRMFact]
                 GROUP BY Delivery_FK
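             ) first_update
               -- Match on the delivery as well as the date; the date alone would pair rows from unrelated deliveries.
               ON iv.Delivery_FK = first_update.Delivery_FK
              AND iv.UpdateDate = first_update.first_update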
        LEFT JOIN 
             [dimension].[Product] pr ON iv.Product_FK =pr.Product_SK
        LEFT JOIN 
             [dimension].[Delivery] le ON le.Delivery_FK = iv.Delivery_FK 
       WHERE 
             pr.Product_Key = '740') X
 WHERE 
       rk = 1

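You should probably try out the old and new versions of the subquery to determine whether they yield the same results. They can differ: ROW_NUMBER() keeps exactly one row per Delivery_FK, while the MIN() join keeps every row tied for the earliest UpdateDate, and the MIN() here is computed over all products rather than only the rows with Product_Key = '740'.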
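One way the two versions can disagree is when several rows share the earliest UpdateDate. The sketch below uses made-up rows (demo_rows and its values are hypothetical) and counts how many rows each approach keeps per delivery:

```sql
WITH demo_rows (Row_ID, Delivery_FK, UpdateDate) AS (
    SELECT *
    FROM (VALUES
            (1, 10, '2021-01-01'),
            (2, 10, '2021-01-01'),   -- ties with row 1 for the earliest date
            (3, 10, '2021-01-02'),
            (4, 20, '2021-03-01')
         ) AS v (Row_ID, Delivery_FK, UpdateDate)
),
by_row_number AS (
    -- ROW_NUMBER() approach: exactly one row per delivery.
    SELECT Delivery_FK
    FROM (
        SELECT Delivery_FK,
               ROW_NUMBER() OVER (PARTITION BY Delivery_FK ORDER BY UpdateDate) AS rk
        FROM demo_rows
    ) AS r
    WHERE rk = 1
),
by_min AS (
    -- MIN() ... GROUP BY approach: every row that matches the earliest date.
    SELECT d.Delivery_FK
    FROM demo_rows AS d
    JOIN (
        SELECT Delivery_FK, MIN(UpdateDate) AS first_update
        FROM demo_rows
        GROUP BY Delivery_FK
    ) AS m
      ON d.Delivery_FK = m.Delivery_FK
     AND d.UpdateDate = m.first_update
)
SELECT 'row_number' AS method, Delivery_FK, COUNT(*) AS row_count
FROM by_row_number
GROUP BY Delivery_FK
UNION ALL
SELECT 'min_join', Delivery_FK, COUNT(*)
FROM by_min
GROUP BY Delivery_FK
ORDER BY method, Delivery_FK;
-- min_join keeps 2 rows for delivery 10; row_number keeps 1.
```

If ties like this exist in the real table, the outer LEFT JOIN in the view would return extra rows with the MIN() version, so a DISTINCT or a tie-breaking column may be needed.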

If you use the subquery I suggest, this index will help it run faster by optimizing the new sub-subquery's MIN()... GROUP BY operation.

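-- Keyed on the GROUP BY column first, then the MIN() column,
-- so the sub-subquery can be answered from the index alone.
CREATE INDEX x_KRMFact_Delivery_Update 
          ON [fact].[KRMFact]
             ([Delivery_FK],[UpdateDate])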

By the way, WHERE pr.Product_Key = '740' turns your LEFT JOIN [dimension].[Product] operation into an ordinary inner JOIN.
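The reason is that unmatched rows from a LEFT JOIN carry NULL in pr.Product_Key, and NULL = '740' is never true, so the WHERE clause discards them. A small sketch on made-up rows (facts and products are hypothetical stand-ins for the real tables) compares filtering in WHERE with filtering in the ON clause:

```sql
WITH facts (Fact_ID, Product_FK) AS (
    SELECT *
    FROM (VALUES (1, 100), (2, 200), (3, NULL)) AS v (Fact_ID, Product_FK)
),
products (Product_SK, Product_Key) AS (
    SELECT *
    FROM (VALUES (100, '740'), (200, '999')) AS v (Product_SK, Product_Key)
)
-- Filter in WHERE: rows without a matching '740' product are dropped,
-- which is the same result an inner JOIN would give.
SELECT 'filter in WHERE' AS variant, COUNT(*) AS row_count
FROM facts AS f
LEFT JOIN products AS p ON f.Product_FK = p.Product_SK
WHERE p.Product_Key = '740'
UNION ALL
-- Filter in ON: every fact row survives; non-matching rows get NULL product columns.
SELECT 'filter in ON', COUNT(*)
FROM facts AS f
LEFT JOIN products AS p ON f.Product_FK = p.Product_SK
                       AND p.Product_Key = '740';
-- 'filter in WHERE' counts 1 row; 'filter in ON' counts 3 rows.
```

Which form is right depends on what the view is meant to return; writing the join as an explicit inner JOIN at least makes the current behavior obvious to readers.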

Answer 1
0 2021-03-06 14:00:02
