Select max value of each group using partition by
I have the following code, which takes a long time to execute. What I need to do is select the rows whose row number equals 1 after partitioning by three columns (col_1, col_2, col_3), which are also the key columns, and ordering by the columns shown below. The table has around 90 million records. Am I following the best approach, or is there a better one?
WITH cte AS (
    SELECT
        b.*
        ,ROW_NUMBER() OVER ( PARTITION BY col_1, col_2, col_3
                             ORDER BY new_col DESC, new_col_2 DESC, new_col_3 DESC ) AS ROW_NUMBER
    FROM (
        SELECT
            *
            ,CASE
                WHEN update_col = ' ' THEN new_update_col
                ELSE update_col
            END AS new_col_1
        FROM schema_name.table_name
    ) b
)
SELECT TOP 10 * FROM cte WHERE ROW_NUMBER = 1
Currently you apply CASE to the columns before any filtering, so it is evaluated for every row in the table. CASE (a string comparison) is a costly operation.
In the end, you keep only the records with ROW_NUMBER = 1. If, as a guess, that filter keeps about half of all records, you can reduce the query execution time by filtering first (generate the row number and keep only the rows with RN = 1) and then applying CASE to the columns.
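The reordering described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or newer, for window functions) instead of the asker's database, the table contents are made up, and the alias `rn` and the CTE name `ranked` are hypothetical. It is valid only because the CASE result (new_col_1) is not used in PARTITION BY or ORDER BY, and it checks that both query shapes return the same rows; it does not measure speed.

```python
import sqlite3

# Hypothetical miniature of the asker's table; column names follow the
# question, the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_name (
    col_1 INTEGER, col_2 INTEGER, col_3 INTEGER,
    new_col INTEGER, new_col_2 INTEGER, new_col_3 INTEGER,
    update_col TEXT, new_update_col TEXT
);
INSERT INTO table_name VALUES
    (1, 1, 1, 10, 0, 0, ' ',    'fallback_a'),
    (1, 1, 1, 20, 0, 0, 'kept', 'fallback_b'),
    (2, 1, 1,  5, 1, 0, ' ',    'fallback_c'),
    (2, 1, 1,  5, 2, 0, ' ',    'fallback_d');
""")

# Shape from the question: CASE is evaluated for every row, then rows are ranked.
case_first = """
WITH cte AS (
    SELECT b.*,
           ROW_NUMBER() OVER (PARTITION BY col_1, col_2, col_3
                              ORDER BY new_col DESC, new_col_2 DESC, new_col_3 DESC) AS rn
    FROM (SELECT *,
                 CASE WHEN update_col = ' ' THEN new_update_col
                      ELSE update_col END AS new_col_1
          FROM table_name) b
)
SELECT col_1, col_2, col_3, new_col_1
FROM cte WHERE rn = 1
ORDER BY col_1, col_2, col_3
"""

# Shape from the answer: rank first, keep rn = 1, then evaluate CASE
# only on the surviving rows.
rank_first = """
WITH ranked AS (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY col_1, col_2, col_3
                              ORDER BY new_col DESC, new_col_2 DESC, new_col_3 DESC) AS rn
    FROM table_name t
)
SELECT col_1, col_2, col_3,
       CASE WHEN update_col = ' ' THEN new_update_col
            ELSE update_col END AS new_col_1
FROM ranked WHERE rn = 1
ORDER BY col_1, col_2, col_3
"""

rows_case_first = conn.execute(case_first).fetchall()
rows_rank_first = conn.execute(rank_first).fetchall()
print(rows_case_first == rows_rank_first)
```

Whether the second shape is actually faster on 90 million rows depends on the engine and its optimizer, so it is worth comparing the execution plans of both versions before adopting either.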