Select max value of each group using partition by

I have the following code, which takes a long time to execute. What I need to do is select the rows whose row number equals 1 after partitioning by three columns (col_1, col_2, col_3), which are also the key columns, and ordering by the columns shown below. The table holds around 90 million records. Am I following the best approach, or is there a better one?

  with cte as (SELECT
     b.*
    ,ROW_NUMBER() OVER ( PARTITION BY col_1,col_2,col_3
                         ORDER BY new_col DESC, new_col_2 DESC, new_col_3 DESC  ) AS ROW_NUMBER
  FROM (
    SELECT
       *
      ,CASE
         WHEN update_col = '        ' THEN new_update_col
         ELSE update_col
       END AS new_col_1
    FROM schema_name.table_name
    ) b
  )
 select top 10 * from cte WHERE ROW_NUMBER=1

Currently you apply CASE across columns for every row in the table. CASE (a string comparison) is a costly operation.

At the end, you keep only the records with ROW_NUMBER = 1. If, as a guess, that filter keeps half of all your records, the query should run faster if you filter first (generate the row number and keep only the rows with RN = 1) and then apply CASE to the columns.
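A rough sketch of that reordering, reusing the table and column names from the question. It assumes that new_col, new_col_2 and new_col_3 are ordinary columns of the table and do not depend on the CASE result; if any ordering column is derived from the CASE, the expression has to stay inside the ranking step. The alias rn and the CTE name ranked are made up for illustration. Whether this actually helps depends on the engine, since some optimizers already defer such expressions on their own, so compare the execution plans before relying on it.

```sql
-- Sketch only: assumes new_col, new_col_2 and new_col_3 are plain columns
-- of schema_name.table_name that do not depend on the CASE expression.
WITH ranked AS (
    SELECT
         t.*
        ,ROW_NUMBER() OVER (
             PARTITION BY col_1, col_2, col_3
             ORDER BY new_col DESC, new_col_2 DESC, new_col_3 DESC
         ) AS rn   -- "rn" instead of ROW_NUMBER, which is also the function name
    FROM schema_name.table_name t
)
SELECT TOP 10
     r.*
    ,CASE
         WHEN r.update_col = '        ' THEN r.new_update_col
         ELSE r.update_col
     END AS new_col_1   -- written so that CASE applies only to rows with rn = 1
FROM ranked r
WHERE r.rn = 1;
```

The ranking step still has to read and sort all of the roughly 90 million rows, so an index that covers the partitioning and ordering columns is likely to matter at least as much as where the CASE sits.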
