Snowflake Conditional Window Function

Question

I am trying to write a window function to retrieve a single field from a second table that I need to join to my existing table. The issue is that finding the correct value among many possible values requires matching two IDs and then, out of the rows where the IDs match, pulling the most recent one that is not after a given date (the date is different for each record and comes from the initial table).

Right now I have written:

Select distinct primary_id,
       first_value(desired_column) over partition by id_1, id_2, order by date desc)
From base_table
Left join second_table 
on second_table.id_1 = base_table.id_1 and 
   second_table.date <= base_table.date

However, this still returns incorrect values. The returned table should have the same row count as base_table, with desired_column added from whichever second_table record matches both IDs and is dated on or before the base_table date (each desired_column value should be a single result: the most recent matching record that is not after the base_table date). My query returns the right row count, but the desired_column values are completely incorrect. I suspect this is because I don't apply the second date <= base date condition inside the window function directly, but that doesn't seem to be possible? I'm not sure how to proceed.

Thank you in advance.

Edit to add:

Sample Base Table

Primary Key	ID1	ID2	Date
1	123	321	01/22/2021
2	123	654	09/02/2022
3	234	432	02/02/2019

Sample Second Table

Desired_Column	ID1	ID2	Date
q	123	321	01/21/2021
r	123	654	09/03/2022
w	234	432	02/01/2019
s	234	432	03/20/2022
a	123	439	02/20/2022
w	999	999	09/10/2022
null	234	987	10/10/2020

Desired Output

Primary Key	ID1	ID2	Date	Desired_Column
1	123	321	01/22/2021	q
2	123	654	09/02/2022	null
3	234	432	02/02/2019	w

Answer 1

So, making some CTEs for the data:

with base_table(primary_id, ID_1, ID_2, Date) as (
    select * from values
    (1, 123, 321, '01/22/2021'::date),
    (2, 123, 654, '09/02/2022'::date),
    (3, 234, 432, '02/02/2019'::date)
), second_table(Desired_Column, ID_1, ID_2, Date) as (
    select * from values
    ('q'    ,123, 321, '01/21/2021'::date),
    ('r'    ,123, 654, '09/03/2022'::date),
    ('w'    ,234, 432, '02/01/2019'::date),
    ('s'    ,234, 432, '03/20/2022'::date),
    ('a'    ,123, 439, '02/20/2022'::date),
    ('w'    ,999, 999, '09/10/2022'::date),
    (null   ,234, 987, '10/10/2020'::date)
)

and then correcting the syntax of your SQL (adding the missing parenthesis and table aliases):

Select distinct b.primary_id,
       first_value(s.desired_column) over (partition by b.id_1, b.id_2 order by s.date desc)
From base_table as b
Left join second_table as s
on s.id_1 = b.id_1 and 
   s.date <= b.date

gives:

PRIMARY_ID	FIRST_VALUE(S.DESIRED_COLUMN) OVER (PARTITION BY B.ID_1, B.ID_2 ORDER BY S.DATE DESC)
1	q
2	a
3	w

But needing the DISTINCT is the hint: this is not the method you are looking for.

Drop the FIRST_VALUE, which produces a result for every row, and use QUALIFY with ROW_NUMBER to rank the rows, keeping only the best one (rank 1):

Select b.primary_id,
       s.desired_column
From base_table as b
Left join second_table as s
on s.id_1 = b.id_1 and 
   s.date <= b.date
qualify row_number() over (partition by b.id_1, b.id_2 order by s.date desc) = 1

gives:

PRIMARY_ID	DESIRED_COLUMN
1	q
2	a
3	w
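For readers less familiar with QUALIFY: it filters on the result of a window function after the window has been computed, much as HAVING filters on aggregates. A rough equivalent without QUALIFY, sketched here against the same sample CTEs, computes the row number in a subquery and filters on it in the outer query. The alias rn and the subquery name ranked are made up for this illustration, and the join still matches on id_1 only, so it reproduces the result above rather than the desired output.

```sql
with base_table(primary_id, id_1, id_2, date) as (
    select * from values
    (1, 123, 321, '01/22/2021'::date),
    (2, 123, 654, '09/02/2022'::date),
    (3, 234, 432, '02/02/2019'::date)
), second_table(desired_column, id_1, id_2, date) as (
    select * from values
    ('q'    ,123, 321, '01/21/2021'::date),
    ('r'    ,123, 654, '09/03/2022'::date),
    ('w'    ,234, 432, '02/01/2019'::date),
    ('s'    ,234, 432, '03/20/2022'::date),
    ('a'    ,123, 439, '02/20/2022'::date),
    ('w'    ,999, 999, '09/10/2022'::date),
    (null   ,234, 987, '10/10/2020'::date)
)
select primary_id,
       desired_column
from (
    select b.primary_id,
           s.desired_column,
           -- rank the joined rows per base-table ID pair, newest second_table date first
           row_number() over (partition by b.id_1, b.id_2 order by s.date desc) as rn
    from base_table as b
    left join second_table as s
      on s.id_1 = b.id_1 and
         s.date <= b.date
) as ranked
where rn = 1   -- same filter that QUALIFY ... = 1 applies
order by primary_id;
```

The subquery form is more verbose, but it is portable to engines that support window functions without QUALIFY.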

This approach also allows access to all the other values from the two tables:

Select b.*,
       s.*   
From base_table as b
Left join second_table as s
on s.id_1 = b.id_1 and 
   s.date <= b.date
qualify row_number() over (partition by b.id_1, b.id_2 order by s.date desc) = 1

And given that you want to match both IDs, you should use this SQL:

Select b.primary_id,
    b.id_1,
    b.id_2,
    s.desired_column 
From base_table as b
Left join second_table as s
on s.id_1 = b.id_1 and 
   s.id_2 = b.id_2 and 
   s.date <= b.date
qualify row_number() over (partition by b.id_1, b.id_2 order by s.date desc) = 1
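
One caveat, offered as a suggestion rather than as part of the answer above: partitioning by b.id_1, b.id_2 assumes each ID pair appears only once in base_table. If two base rows could share the same pair, they would land in one partition and only one of them would survive the QUALIFY, so the row count would no longer match base_table. A minimal sketch that partitions by b.primary_id instead, assuming primary_id is unique per base row, keeps exactly one row per base row. With the sample data it should return q, NULL, and w for primary_id 1, 2, and 3, matching the desired output.

```sql
with base_table(primary_id, id_1, id_2, date) as (
    select * from values
    (1, 123, 321, '01/22/2021'::date),
    (2, 123, 654, '09/02/2022'::date),
    (3, 234, 432, '02/02/2019'::date)
), second_table(desired_column, id_1, id_2, date) as (
    select * from values
    ('q'    ,123, 321, '01/21/2021'::date),
    ('r'    ,123, 654, '09/03/2022'::date),
    ('w'    ,234, 432, '02/01/2019'::date),
    ('s'    ,234, 432, '03/20/2022'::date),
    ('a'    ,123, 439, '02/20/2022'::date),
    ('w'    ,999, 999, '09/10/2022'::date),
    (null   ,234, 987, '10/10/2020'::date)
)
select b.primary_id,
       b.id_1,
       b.id_2,
       b.date,
       s.desired_column
from base_table as b
left join second_table as s
  on s.id_1 = b.id_1 and
     s.id_2 = b.id_2 and
     s.date <= b.date        -- only second_table rows on or before the base date
-- one partition per base row (assumes primary_id is unique), newest match first
qualify row_number() over (partition by b.primary_id order by s.date desc) = 1
order by b.primary_id;
```

Base rows with no qualifying match, such as primary_id 2 here, still appear once because the left join yields a single row with NULLs for the second_table columns.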
