SQL lag to row which meets condition
I have a table which contains measurements taken on random dates, partitioned by the site at which they were taken.
site | date | measurement |
---|---|---|
AB1234 | 2022-12-09 | 1 |
AB1234 | 2022-06-11 | 2 |
AB1234 | 2019-05-22 | 3 |
AB1234 | 2017-01-30 | 4 |
CD5678 | 2022-11-01 | 5 |
CD5678 | 2020-04-10 | 6 |
CD5678 | 2017-04-10 | 7 |
CD5678 | 2017-01-22 | 8 |
In order to calculate year-on-year growth, I want an additional field on each record which contains the previous measurement at that site. The challenging part is that I only want the most recent previous measurement that was taken more than a year before the current one.
Like so:
site | date | measurement | previous_measurement |
---|---|---|---|
AB1234 | 2022-12-09 | 1 | 3 |
AB1234 | 2022-06-11 | 2 | 3 |
AB1234 | 2019-05-22 | 3 | 4 |
AB1234 | 2017-01-30 | 4 | NULL |
CD5678 | 2022-11-01 | 5 | 6 |
CD5678 | 2020-04-10 | 6 | 7 |
CD5678 | 2017-04-10 | 7 | NULL |
CD5678 | 2017-01-22 | 8 | NULL |
It feels like it should be possible with a window function, but I can't work it out.
Please help :(
Amazon Athena engine version 3 is based on Trino. If it has picked up Trino's full support for the RANGE frame type in window functions, you can use that:
-- sample data
with dataset(site, date, measurement) as (
    values ('AB1234', date '2022-12-09', 1),
           ('AB1234', date '2022-06-11', 2),
           ('AB1234', date '2019-05-22', 3),
           ('AB1234', date '2017-01-30', 4),
           ('CD5678', date '2022-11-01', 5),
           ('CD5678', date '2020-04-10', 6),
           ('CD5678', date '2017-04-10', 7),
           ('CD5678', date '2017-01-22', 8)
)
-- query
select *,
       last_value(measurement) over (
           partition by site
           order by date
           RANGE BETWEEN UNBOUNDED PRECEDING AND interval '1' year PRECEDING)
from dataset;
Output:
site | date | measurement | _col3 |
---|---|---|---|
CD5678 | 2017-01-22 | 8 | NULL |
CD5678 | 2017-04-10 | 7 | NULL |
CD5678 | 2020-04-10 | 6 | 7 |
CD5678 | 2022-11-01 | 5 | 6 |
AB1234 | 2017-01-30 | 4 | NULL |
AB1234 | 2019-05-22 | 3 | 4 |
AB1234 | 2022-06-11 | 2 | 3 |
AB1234 | 2022-12-09 | 1 | 3 |
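If the engine you are on turns out not to accept a RANGE frame with an interval offset, a self-join should give the same values. This is only a sketch, not something I have run on Athena. It assumes each (site, date, measurement) combination is unique, so that grouping on those columns identifies a single row. The aliases `cur` and `prev` and the column name `previous_measurement` are just illustrative. `max_by(x, y)` is the Trino aggregate that returns the `x` belonging to the largest `y`.

```sql
-- sample data (same as above)
with dataset(site, date, measurement) as (
    values ('AB1234', date '2022-12-09', 1),
           ('AB1234', date '2022-06-11', 2),
           ('AB1234', date '2019-05-22', 3),
           ('AB1234', date '2017-01-30', 4),
           ('CD5678', date '2022-11-01', 5),
           ('CD5678', date '2020-04-10', 6),
           ('CD5678', date '2017-04-10', 7),
           ('CD5678', date '2017-01-22', 8)
)
-- fallback query: join each row to all rows at the same site
-- that are at least one year older, then keep the latest of them
select cur.site,
       cur.date,
       cur.measurement,
       -- measurement of the matching row with the greatest date;
       -- NULL when the left join found no row that is old enough
       max_by(prev.measurement, prev.date) as previous_measurement
from dataset cur
left join dataset prev
       on prev.site = cur.site
      and prev.date <= cur.date - interval '1' year
group by cur.site, cur.date, cur.measurement;
```

The `<=` mirrors the window version, where `interval '1' year PRECEDING` includes a row dated exactly one year earlier. Change it to `<` if "more than a year" has to be strict. On large tables the self-join is likely to be more expensive than the window function, so I would only reach for it when the RANGE frame is unavailable.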