How to use a window function in Redshift?

I have 2 tables:

Product

| product_id | source_id |
|:----:|:----:|

Source

| source_id | priority |
|:----:|:----:|

Sometimes one product_id has several sources, and my task is to select the row with the minimum priority for each product. For example, from:

| product_id | source_id | priority |
|:----:|:------:|:-----:|
| 10 | 2 | 9 |
| 10 | 4 | 2 |
| 20 | 2 | 9 |
| 20 | 4 | 2 |
| 30 | 2 | 9 |
| 30 | 4 | 2 |

The correct result should be:

| product_id | source_id | priority |
|:----:|:------:|:-----:|
| 10 | 4 | 2 |
| 20 | 4 | 2 |
| 30 | 4 | 2 |

I am using this query:

SELECT p.product_id, p.source_id, s.priority FROM Product p
INNER JOIN Source s on s.source_id = p.source_id
WHERE s.priority = (SELECT Min(s1.priority) OVER (PARTITION BY p.product_id) FROM Source s1)

It returns the error "this type of correlated subquery pattern is not supported yet". As I understand it, I can't use this variant in Redshift. How should this be solved? Are there any other ways?

You just need to move the WHERE-clause logic into a derived table, and the easiest way to flag the minimum-priority row is the ROW_NUMBER() window function. As written, your query asks Redshift to re-evaluate the window function for every row it tests, which is very inefficient on a clustered database. Try the following (untested):

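SELECT product_id, source_id, priority
FROM (
    -- Join first: product_id lives in Product, priority in Source.
    SELECT p.product_id,
        p.source_id,
        s.priority,
        -- Number the rows of each product from lowest to highest priority.
        ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY s.priority) AS row_num
    FROM Product p
    INNER JOIN Source s ON s.source_id = p.source_id
) ranked
WHERE row_num = 1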

Now the window function only runs once. You can also move the subquery to a CTE if that improves readability for your full case.
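As a rough illustration of the CTE form mentioned above, here is a minimal sketch that runs the query against the sample data from the question. It uses Python's built-in sqlite3 module as a local stand-in for Redshift, which is an assumption made only so the example can be executed anywhere (it needs SQLite 3.25 or newer for window functions); the CTE name `ranked` is hypothetical. The join happens inside the CTE because product_id comes from Product while priority comes from Source.

```python
import sqlite3

# In-memory SQLite is only a local stand-in for Redshift (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (product_id INTEGER, source_id INTEGER);
    CREATE TABLE Source  (source_id INTEGER, priority INTEGER);
    INSERT INTO Product VALUES (10, 2), (10, 4), (20, 2), (20, 4), (30, 2), (30, 4);
    INSERT INTO Source  VALUES (2, 9), (4, 2);
""")

# Same logic as the derived-table query, with the subquery moved into a CTE.
CTE_QUERY = """
WITH ranked AS (
    SELECT p.product_id,
           p.source_id,
           s.priority,
           ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY s.priority) AS row_num
    FROM Product p
    INNER JOIN Source s ON s.source_id = p.source_id
)
SELECT product_id, source_id, priority
FROM ranked
WHERE row_num = 1
ORDER BY product_id
"""

rows = conn.execute(CTE_QUERY).fetchall()
print(rows)  # [(10, 4, 2), (20, 4, 2), (30, 4, 2)]
```

The printed rows match the expected result table in the question. Behaviour on a real Redshift cluster should still be checked there, since SQLite and Redshift differ in many other respects.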

I already found the best solution for this case:

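SELECT product_id, source_id, priority
FROM (
    SELECT
      p.product_id
    , p.source_id
    , s.priority
      -- Lowest priority among all sources of the same product.
    , MIN(s.priority) OVER (PARTITION BY p.product_id) AS min_priority
    FROM Product p
        INNER JOIN Source s
                ON s.source_id = p.source_id
) t
-- A window function result cannot be filtered in the same query level,
-- so the comparison happens in the outer query.
WHERE priority = min_priority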
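One difference between the MIN() OVER approach and the ROW_NUMBER() approach is how they treat ties. The sketch below uses hypothetical data in which two sources of the same product share the lowest priority, again with Python's sqlite3 module standing in for Redshift (an assumption for local testing only). Both queries wrap the window function in a derived table, because a window result generally cannot be referenced in the WHERE clause of the query level that computes it. The extra `p.source_id` in the ROW_NUMBER() ordering is an added tie-breaker, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (product_id INTEGER, source_id INTEGER);
    CREATE TABLE Source  (source_id INTEGER, priority INTEGER);
    -- Hypothetical data: sources 4 and 5 tie on the lowest priority.
    INSERT INTO Product VALUES (10, 2), (10, 4), (10, 5);
    INSERT INTO Source  VALUES (2, 9), (4, 2), (5, 2);
""")

# MIN() OVER keeps every row whose priority equals the partition minimum.
MIN_QUERY = """
SELECT product_id, source_id, priority
FROM (
    SELECT p.product_id, p.source_id, s.priority,
           MIN(s.priority) OVER (PARTITION BY p.product_id) AS min_priority
    FROM Product p
    INNER JOIN Source s ON s.source_id = p.source_id
) t
WHERE priority = min_priority
ORDER BY product_id, source_id
"""

# ROW_NUMBER() keeps exactly one row per product; source_id breaks ties.
ROW_NUMBER_QUERY = """
SELECT product_id, source_id, priority
FROM (
    SELECT p.product_id, p.source_id, s.priority,
           ROW_NUMBER() OVER (
               PARTITION BY p.product_id
               ORDER BY s.priority, p.source_id
           ) AS row_num
    FROM Product p
    INNER JOIN Source s ON s.source_id = p.source_id
) t
WHERE row_num = 1
"""

min_rows = conn.execute(MIN_QUERY).fetchall()
row_number_rows = conn.execute(ROW_NUMBER_QUERY).fetchall()
print(min_rows)         # [(10, 4, 2), (10, 5, 2)]
print(row_number_rows)  # [(10, 4, 2)]
```

If each product must yield exactly one row, the ROW_NUMBER() form with an explicit tie-breaker is the safer choice. If all tied sources should be returned, the MIN() OVER form does that directly.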
