
How to calculate the days to the next and since the last special date in a SQL table in Vertica?

I have a SQL table like the one below:

date        col1
2020-01-02  xxx

And I have special dates such as 2020-01-01, 2020-01-05, and 2020-05-10.
I need to calculate the number of days to the next special date and the number of days since the last special date, so I need a result like the one below:

next_special = 3 because the next special date after 2020-01-02 is 2020-01-05 (3 days away)
last_special = 1 because the last special date before 2020-01-02 was 1 day earlier (2020-01-01)

date        col1  next_special  last_special
2020-01-02  xxx   3             1

If I understand correctly, you can use case expressions along with least() and nullif():

select t.*,
       nullif( least(case when '2020-01-01' < date then datediff(day, '2020-01-01', date) else 999999 end,
                     case when '2020-01-05' < date then datediff(day, '2020-01-05', date) else 999999 end,
                     case when '2020-05-10' < date then datediff(day, '2020-05-10', date) else 999999 end
                    ), 999999
             ) as prev_special,
       nullif( least(case when '2020-01-01' > date then datediff(day, date, '2020-01-01') else 999999 end,
                     case when '2020-01-05' > date then datediff(day, date, '2020-01-05') else 999999 end,
                     case when '2020-05-10' > date then datediff(day, date, '2020-05-10') else 999999 end
                    ), 999999
             ) as next_special
from t;

I think this would be simpler to express in almost any other database, because they support correlated scalar subqueries in the SELECT clause.

EDIT:

If you have the special dates in a separate table and you have a unique id on your original table, you could use:

select t.date, t.col1,
       min(case when d.date < t.date then datediff(day, d.date, t.date) end) as prev_special,
       min(case when d.date > t.date then datediff(day, t.date, d.date) end) as next_special
from t cross join
     dates d
group by t.date, t.col1;

Easier to express. Much worse performance-wise.
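
As a cross-check for either query, the same arithmetic can be sketched outside the database. This Python snippet is not part of the answer, and the function name `special_gaps` is made up for illustration. It mirrors the cross join: compare the row date against every special date and keep the smallest gap on each side, or None (standing in for NULL) when there is no special date on that side.

```python
from datetime import date

# Special dates from the question.
SPECIAL_DATES = [date(2020, 1, 1), date(2020, 1, 5), date(2020, 5, 10)]

def special_gaps(d, specials=SPECIAL_DATES):
    """Return (prev_special, next_special) in days; None where no such date exists."""
    # Days since each special date that lies strictly before d.
    before = [(d - s).days for s in specials if s < d]
    # Days until each special date that lies strictly after d.
    after = [(s - d).days for s in specials if s > d]
    return (min(before) if before else None,
            min(after) if after else None)

print(special_gaps(date(2020, 1, 2)))  # (1, 3)
```

For the row in the question this gives 1 day since the last special date and 3 days to the next one, which is what the SQL should return as well.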

Vertica offers the event series join, which you get by using the INTERPOLATE PREVIOUS VALUE predicate in a LEFT JOIN clause. Instead of NULLs in a LEFT JOIN, you get the data from the immediately preceding row.
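
To make that matching rule concrete, here is a rough Python emulation. It is an illustration only, not how Vertica implements the join, and the helper names `asof_prev`, `next_after` and `gaps` are hypothetical. For each input date it picks the latest special date that is less than or equal to it, looks up that special date's successor, and then applies the same day arithmetic as the query further down, using the same four special dates.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Same special dates as the specdays CTE in the query; must stay sorted.
specdays = [date(2020, 1, 1), date(2020, 1, 3), date(2020, 1, 6), date(2020, 1, 10)]

def asof_prev(d, keys):
    """Latest key <= d, or None if d precedes every key."""
    i = bisect_right(keys, d)
    return keys[i - 1] if i else None

def next_after(prev, keys):
    """Successor of prev in keys, like LEAD(dt) OVER (ORDER BY dt)."""
    if prev is None:
        return None
    i = keys.index(prev) + 1
    return keys[i] if i < len(keys) else None

def gaps(d, keys=specdays):
    """Return (prevspecday, nextspecday, last_special, next_special) for date d."""
    prev = asof_prev(d, keys)
    nxt = next_after(prev, keys)
    last_special = (d - prev).days if prev is not None else None
    if prev == d:
        next_special = 0          # d is itself a special day
    elif nxt is not None:
        next_special = (nxt - d).days
    else:
        next_special = None       # no later special day known
    return prev, nxt, last_special, next_special

# Walk 1st Jan to 15th Jan, like the indata CTE.
for offset in range(15):
    d = date(2020, 1, 1) + timedelta(days=offset)
    print(d, *gaps(d))
```

None plays the role of NULL here, so dates after the last special day get a last_special value but no next_special.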

And Vertica is pretty good at OLAP functions, and at pipeline parallelism, so nesting several queries usually hurts a bit less.

In the example below, I create a series of 15 consecutive dates from the two limit dates, 1st Jan and 15th Jan, add 'xxx' as col1 in indata, join it with the specdays table (after enriching each specdays row with its successor date) using an event series join, and do the maths.

Is that what you're after?

WITH
specdays(dt) AS (
          SELECT DATE '2020-01-01' -- new year's day
UNION ALL SELECT DATE '2020-01-03' -- arbitrary special day
UNION ALL SELECT DATE '2020-01-06' -- epiphany
UNION ALL SELECT DATE '2020-01-10' -- arbitrary special day
)
,
prevnext AS (
  SELECT
    dt
, LEAD(dt) OVER w AS nextdt
FROM specdays
WINDOW w AS (ORDER BY dt)
)
,
-- list of dates between 1st Jan and 15th Jan
indata AS (
SELECT
  tms::DATE AS dt
, 'xxx'     AS col1
FROM (
          SELECT TIMESTAMP '2020-01-01' 
UNION ALL SELECT TIMESTAMP '2020-01-15' 
) limits(dt)
TIMESERIES tms AS '1 DAY' OVER(ORDER BY dt)
)
SELECT 
  indata.* 
, prevnext.dt     AS prevspecday
, prevnext.nextdt AS nextspecday
, TIMESTAMPDIFF('DAY',prevnext.dt    ,indata.dt  ) AS last_special
, CASE WHEN prevnext.dt = indata.dt
    THEN 0
    ELSE TIMESTAMPDIFF('DAY',indata.dt  ,prevnext.nextdt)
  END AS next_special
FROM indata
LEFT JOIN prevnext ON indata.dt INTERPOLATE PREVIOUS VALUE prevnext.dt
;

dt        |col1|prevspecday|nextspecday|last_special|next_special
2020-01-01|xxx |2020-01-01 |2020-01-03 |           0|           0
2020-01-02|xxx |2020-01-01 |2020-01-03 |           1|           1
2020-01-03|xxx |2020-01-03 |2020-01-06 |           0|           0
2020-01-04|xxx |2020-01-03 |2020-01-06 |           1|           2
2020-01-05|xxx |2020-01-03 |2020-01-06 |           2|           1
2020-01-06|xxx |2020-01-06 |2020-01-10 |           0|           0
2020-01-07|xxx |2020-01-06 |2020-01-10 |           1|           3
2020-01-08|xxx |2020-01-06 |2020-01-10 |           2|           2
2020-01-09|xxx |2020-01-06 |2020-01-10 |           3|           1
2020-01-10|xxx |2020-01-10 |-          |           0|           0
2020-01-11|xxx |2020-01-10 |-          |           1|-
2020-01-12|xxx |2020-01-10 |-          |           2|-
2020-01-13|xxx |2020-01-10 |-          |           3|-
2020-01-14|xxx |2020-01-10 |-          |           4|-
2020-01-15|xxx |2020-01-10 |-          |           5|-

