Postgres overlapping date ranges for view
I have three tables:
Table 1      Table 2      Table 3
start_date   start_date   start_date
end_date     end_date     end_date
val          val          val
Now let's say I have the following in the tables:
Table 1
start_date  end_date    val
01-01-2000  31-01-2000  APPLE
01-02-2000              ORANGE
Table 2
start_date  end_date    val
01-01-2000  15-01-2000  TOMATO
16-01-2000              LETTUCE
Table 3
start_date  end_date    val
01-12-1999              CAR
I want to combine the three tables above into a view with min/max dates, which would look like this:
start_date  end_date    val_table_1  val_table_2  val_table_3
01-12-1999  31-12-1999  null         null         CAR
01-01-2000  15-01-2000  APPLE        TOMATO       CAR
16-01-2000  31-01-2000  APPLE        LETTUCE      CAR
01-02-2000              ORANGE       LETTUCE      CAR
I was able to achieve the wanted result with the query below, which is also available as a demo at SQL Fiddle. Leave out the final order by clause if you are creating a view; it is included here only to present the results sensibly.
PostgreSQL 9.6 Schema Setup:
CREATE TABLE Table1
    ("start_date" timestamp, "end_date" timestamp, "val" varchar(20))
;
INSERT INTO Table1
    ("start_date", "end_date", "val")
VALUES
    ('2000-01-01 00:00:00', '2000-01-31', 'APPLE'),
    ('2000-02-01 00:00:00', NULL, 'ORANGE')
;
CREATE TABLE Table2
    ("start_date" timestamp, "end_date" timestamp, "val" varchar(20))
;
INSERT INTO Table2
    ("start_date", "end_date", "val")
VALUES
    ('2000-01-01', '2000-01-15', 'TOMATO'),
    ('2000-01-16', NULL, 'LETTUCE')
;
CREATE TABLE Table3
    ("start_date" timestamp, "end_date" timestamp, "val" varchar(3))
;
INSERT INTO Table3
    ("start_date", "end_date", "val")
VALUES
    -- ISO literal for 12 January 1999; the question's 01-12-1999 (dd-mm-yyyy) would be '1999-12-01'
    ('1999-01-12 00:00:00', NULL, 'CAR')
;
Query 1:
with ends as (
    -- every end_date that is actually set, across all three tables
    select end_date from Table1 where end_date is not null union
    select end_date from Table2 where end_date is not null union
    select end_date from Table3 where end_date is not null
)
select
      d.start_date
    , least(e.end_date, lead(d.start_date,1) over(order by d.start_date) - INTERVAL '1 DAY') as end_date
    , table1.val as t1_val
    , table2.val as t2_val
    , table3.val as t3_val
from (
    -- each distinct start_date opens a new row of the result
    select start_date from Table1 union
    select start_date from Table2 union
    select start_date from Table3
) d
left join lateral (
    -- earliest known end_date after this start_date
    select ends.end_date from ends where ends.end_date > d.start_date
    order by end_date
    limit 1
) e on true
-- a NULL end_date is treated as a range that runs up to today
left join table1 on d.start_date between table1.start_date and coalesce(table1.end_date,current_date)
left join table2 on d.start_date between table2.start_date and coalesce(table2.end_date,current_date)
left join table3 on d.start_date between table3.start_date and coalesce(table3.end_date,current_date)
order by
    start_date, end_date
| start_date | end_date | t1_val | t2_val | t3_val |
|----------------------|----------------------|--------|---------|--------|
| 1999-01-12T00:00:00Z | 1999-12-31T00:00:00Z | (null) | (null) | CAR |
| 2000-01-01T00:00:00Z | 2000-01-15T00:00:00Z | APPLE | TOMATO | CAR |
| 2000-01-16T00:00:00Z | 2000-01-31T00:00:00Z | APPLE | LETTUCE | CAR |
| 2000-02-01T00:00:00Z | (null) | ORANGE | LETTUCE | CAR |
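To make the steps of the query easier to follow, here is a minimal Python sketch of the same idea, using the rows from the schema setup above. The function names (value_at, merged_ranges) are made up for illustration, and an open-ended range is treated as unbounded here, whereas the SQL caps it at current_date.

```python
from datetime import date, timedelta

# Rows are (start_date, end_date, val); an end_date of None means open-ended.
table1 = [(date(2000, 1, 1), date(2000, 1, 31), "APPLE"),
          (date(2000, 2, 1), None, "ORANGE")]
table2 = [(date(2000, 1, 1), date(2000, 1, 15), "TOMATO"),
          (date(2000, 1, 16), None, "LETTUCE")]
table3 = [(date(1999, 1, 12), None, "CAR")]
tables = [table1, table2, table3]


def value_at(table, day):
    """Return the val whose range covers `day`, or None if no range does."""
    for start, end, val in table:
        if start <= day and (end is None or day <= end):
            return val
    return None


def merged_ranges(tables):
    # Distinct start dates play the role of the derived table "d".
    starts = sorted({start for t in tables for start, _, _ in t})
    # Distinct non-null end dates play the role of the CTE "ends".
    ends = sorted({end for t in tables for _, end, _ in t if end is not None})
    rows = []
    for i, start in enumerate(starts):
        candidates = []
        if i + 1 < len(starts):
            # Same as lead(start_date) - 1 day.
            candidates.append(starts[i + 1] - timedelta(days=1))
        later_ends = [e for e in ends if e > start]
        if later_ends:
            # Same as the lateral subquery: earliest end after this start.
            candidates.append(later_ends[0])
        # least() in Postgres ignores NULLs, so take the min of what exists.
        end = min(candidates) if candidates else None
        rows.append((start, end, *[value_at(t, start) for t in tables]))
    return rows


for row in merged_ranges(tables):
    print(row)
```

The four tuples printed correspond to the four rows of the result table above, with None in place of (null).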
I think the other answer is more complicated than it needs to be.
with t as (
      select start_date as dte, val as val1, null as val2, null as val3 from table1
      union all
      select start_date as dte, null as val1, val as val2, null as val3 from table2
      union all
      select start_date as dte, null as val1, null as val2, val as val3 from table3
     ),
     tt as (  -- eliminate duplicates: one row per date
      select dte, max(val1) as val1, max(val2) as val2, max(val3) as val3
      from t
      group by dte
     ),
     g as (  -- count() skips NULLs, so each counter only advances on a row that carries a value
      select tt.*,
             count(val1) over (order by dte) as grp1,
             count(val2) over (order by dte) as grp2,
             count(val3) over (order by dte) as grp3
      from tt
     )
select dte as start_date,
       lead(dte) over (order by dte) - interval '1 day' as end_date,
       -- within a group, the earliest row is the one holding the value (or NULL before any value exists)
       first_value(val1) over (partition by grp1 order by dte) as val1,
       first_value(val2) over (partition by grp2 order by dte) as val2,
       first_value(val3) over (partition by grp3 order by dte) as val3
from g
order by start_date;
This relies on a key observation about the data: the time frames are "tiled" within each table. They do not overlap and they do not have gaps. For the purposes of this version, the values are also assumed to be non-NULL (handling NULL values just makes the query look a bit more complicated, but the logic is the same). If you need help with that, then please ask another question.
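Since the approach depends on the tiling assumption, it may be worth checking it before trusting the output. The following is a small Python sketch of such a check for one table's ranges; is_tiled is a hypothetical helper, and it assumes inclusive end dates at day granularity, with only the last range allowed to be open-ended.

```python
from datetime import date, timedelta


def is_tiled(ranges):
    """ranges: list of (start_date, end_date) pairs with inclusive ends.

    Returns True when, in start order, every range ends exactly one day
    before the next one starts (no gaps, no overlaps). Only the last
    range may have an end_date of None.
    """
    ordered = sorted(ranges, key=lambda r: r[0])
    for (_, end), (next_start, _) in zip(ordered, ordered[1:]):
        if end is None or next_start != end + timedelta(days=1):
            return False
    return True


# Table 1 from the question: January, then an open-ended range from 1 February.
print(is_tiled([(date(2000, 1, 1), date(2000, 1, 31)),
                (date(2000, 2, 1), None)]))
```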
How does this work? t shows what happens in each table at each start_date. tt handles duplicates on a given date, which provides each row of the result table. What remains is to carry the most recent non-NULL value forward into the rows where a column is NULL. lag(ignore nulls) would be the natural tool for that, but Postgres does not support it. Instead this uses a first_value() equivalent.
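As an illustration of what that fill step is meant to achieve, here is a short Python sketch of "carry the last non-NULL value forward" on a single column. fill_forward is a made-up name, and the sample list is the val1 column in start_date order for the fiddle data, before filling.

```python
def fill_forward(values):
    """Replace each None with the most recent earlier non-None value."""
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        # Before the first real value appears, last is still None.
        filled.append(last)
    return filled


# val1 by start_date: nothing yet, APPLE starts, a Table 2 boundary, ORANGE starts.
print(fill_forward([None, "APPLE", None, "ORANGE"]))
# prints [None, 'APPLE', 'APPLE', 'ORANGE']
```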