Postgres overlapping date ranges for view

I have three tables:

    Table 1             Table 2             Table 3
    start_date          start_date          start_date
    end_date            end_date            end_date
    val                 val                 val

Now let's say I have the following in the tables:

Table 1
start_date    end_date     val
01-01-2000     31-01-2000   APPLE
01-02-2000                  ORANGE

Table 2
start_date    end_date     val
01-01-2000     15-01-2000   TOMATO
16-01-2000                  LETTUCE

Table 3
start_date    end_date     val
01-12-1999                  CAR

I want the above three tables combined into a view with min/max dates, which would look like this:

start_date     end_date      val_table_1    val_table_2    val_table_3
01-12-1999     31-12-1999       null           null           CAR
01-01-2000     15-01-2000       APPLE          TOMATO         CAR
16-01-2000     31-01-2000       APPLE          LETTUCE        CAR
01-02-2000                      ORANGE         LETTUCE        CAR

I was able to achieve the wanted result with the query below. It is also available as a demo at SQL Fiddle. Leave out the final order by clause if creating a view; it is included here just to present the results sensibly.

PostgreSQL 9.6 Schema Setup:

CREATE TABLE Table1
    ("start_date" timestamp, "end_date" timestamp, "val" varchar(20))
;

INSERT INTO Table1
    ("start_date", "end_date", "val")
VALUES
    ('2000-01-01 00:00:00', '2000-01-31', 'APPLE'),
    ('2000-02-01 00:00:00', NULL, 'ORANGE')
;

CREATE TABLE Table2
    ("start_date" timestamp, "end_date" timestamp, "val" varchar(20))
;

INSERT INTO Table2
    ("start_date", "end_date", "val")
VALUES
    ('2000-01-01', '2000-01-15', 'TOMATO'),
    ('2000-01-16', NULL, 'LETTUCE')
;

CREATE TABLE Table3
    ("start_date" timestamp, "end_date" timestamp, "val" varchar(3))
;

INSERT INTO Table3
    ("start_date", "end_date", "val")
VALUES
    ('1999-01-12 00:00:00', NULL, 'CAR')
;

Query 1:

with ends as (
          select end_date from Table1 where end_date is not null union
          select end_date from Table2 where end_date is not null union
          select end_date from Table3 where end_date is not null
          )
select
       d.start_date
     , least(e.end_date, lead(d.start_date,1) over(order by d.start_date) - INTERVAL '1 DAY') as end_date
     , table1.val as t1_val
     , table2.val as t2_val
     , table3.val as t3_val
from (
    select start_date from Table1 union
    select start_date from Table2 union
    select start_date from Table3
    ) d
left join lateral (
  select ends.end_date from ends where ends.end_date > d.start_date
  order by end_date
  limit 1
  ) e on true
left join table1 on d.start_date between table1.start_date and coalesce(table1.end_date,current_date)
left join table2 on d.start_date between table2.start_date and coalesce(table2.end_date,current_date)
left join table3 on d.start_date between table3.start_date and coalesce(table3.end_date,current_date)
order by
       start_date, end_date

Results:

|           start_date |             end_date | t1_val |  t2_val | t3_val |
|----------------------|----------------------|--------|---------|--------|
| 1999-01-12T00:00:00Z | 1999-12-31T00:00:00Z | (null) |  (null) |    CAR |
| 2000-01-01T00:00:00Z | 2000-01-15T00:00:00Z |  APPLE |  TOMATO |    CAR |
| 2000-01-16T00:00:00Z | 2000-01-31T00:00:00Z |  APPLE | LETTUCE |    CAR |
| 2000-02-01T00:00:00Z |               (null) | ORANGE | LETTUCE |    CAR |
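
As a rough sketch of the view mentioned earlier, Query 1 can be wrapped in a create view statement with the final order by dropped. The view name combined_ranges is hypothetical (it is not part of the original answer), and the tables are assumed to exist as in the schema setup.

```sql
-- Sketch only: combined_ranges is a hypothetical view name.
-- Assumes Table1, Table2 and Table3 exist as in the schema setup.
create view combined_ranges as
with ends as (
    select end_date from table1 where end_date is not null union
    select end_date from table2 where end_date is not null union
    select end_date from table3 where end_date is not null
)
select
       d.start_date
       -- least() skips NULL arguments in Postgres, so an open-ended row stays open
     , least(e.end_date, lead(d.start_date, 1) over (order by d.start_date) - interval '1 day') as end_date
     , table1.val as t1_val
     , table2.val as t2_val
     , table3.val as t3_val
from (
    select start_date from table1 union
    select start_date from table2 union
    select start_date from table3
    ) d
left join lateral (
    select ends.end_date from ends where ends.end_date > d.start_date
    order by ends.end_date
    limit 1
    ) e on true
left join table1 on d.start_date between table1.start_date and coalesce(table1.end_date, current_date)
left join table2 on d.start_date between table2.start_date and coalesce(table2.end_date, current_date)
left join table3 on d.start_date between table3.start_date and coalesce(table3.end_date, current_date)
;

-- Ordering is applied by the caller rather than stored in the view.
select * from combined_ranges order by start_date, end_date;
```

Keeping the order by in the outer select leaves the sort order up to each consumer of the view.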

I think the other answer is more complicated than it needs to be.

with t as (
      select start_date as dte, val as val1, null as val2, null as val3 from table1
      union all
      select start_date as dte, null as val1, val as val2, null as val3 from table2
      union all
      select start_date as dte, null as val1, null as val2, val as val3 from table3
     ),
     tt as (  -- eliminate duplicates: one row per distinct start date
      select dte, max(val1) as val1, max(val2) as val2, max(val3) as val3
      from t
      group by dte
     ),
     g as (  -- count(valN) only increases on non-NULL rows, so it labels each
             -- non-NULL value together with the NULL rows that follow it
      select tt.*,
             count(val1) over (order by dte) as grp1,
             count(val2) over (order by dte) as grp2,
             count(val3) over (order by dte) as grp3
      from tt
     )
select dte as start_date,
       lead(dte) over (order by dte) - interval '1 day' as end_date,
       -- the first row of each group carries the value; later rows inherit it
       first_value(val1) over (partition by grp1 order by dte) as val1,
       first_value(val2) over (partition by grp2 order by dte) as val2,
       first_value(val3) over (partition by grp3 order by dte) as val3
from g
order by start_date;

This relies on a key observation about the data: the time frames are "tiled" within each table. They do not overlap and they do not have gaps. For the purposes of this version, the values are also not NULL (handling NULL values just makes the query look a bit more complicated, but the logic is the same). If you need help with that, then please ask another question.
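
The tiling assumption can be checked before relying on it. The query below is a minimal sketch, not part of the original answer, that assumes Table1 from the schema setup and treats ranges as inclusive whole days. It lists any row whose successor does not start exactly one day after it ends. The same check would be repeated for the other tables.

```sql
-- Sketch: rows returned here break the "tiled" assumption for table1.
-- Assumes end_date is inclusive and ranges are whole days.
select *
from (
    select start_date,
           end_date,
           lead(start_date) over (order by start_date) as next_start
    from table1
) x
where next_start is not null                        -- the last row has no successor
  and (end_date is null                             -- open-ended row that is not last
       or next_start <> end_date + interval '1 day' -- gap or overlap
      );
```

An empty result means the table has neither gaps nor overlaps under these assumptions.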

How does this work?

  • t shows what happens in each table at each start_date.
  • tt handles duplicates on a given date. This provides each row of the result table.
  • Finally, what you really want is lag(ignore nulls), but Postgres does not support it. Instead this uses a first_value() equivalent.
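
One common way to emulate lag(ignore nulls) with first_value() can be shown in isolation. This is a minimal sketch on made-up rows: the names s, g, grp and filled are hypothetical and chosen only for illustration. A running count(val) increases only on non-NULL rows, so each non-NULL value and the NULL rows after it share a group number, and first_value() within that group returns the value to carry forward.

```sql
-- Sketch on hypothetical sample rows; only the pattern matters.
with s(dte, val) as (
    values (date '1999-12-01', null),
           (date '2000-01-01', 'APPLE'),
           (date '2000-01-16', null),
           (date '2000-02-01', 'ORANGE')
),
g as (
    -- count() ignores NULLs, so grp stays flat across NULL rows
    select s.*, count(val) over (order by dte) as grp
    from s
)
select dte,
       val,
       -- the first row of each group is the non-NULL one (group 0 has none)
       first_value(val) over (partition by grp order by dte) as filled
from g
order by dte;
```

Rows before the first non-NULL value fall into group 0 and stay NULL, which matches what lag(ignore nulls) would return there.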

Here is a rextester.
