How to do gap filling and interpolation based on a date range in PostgreSQL/Vertica?
I have day-wise data (num) against some dimensions (i.e. cnt and cnt_id). I want to interpolate the date, the dimensions (cnt and cnt_id), and cumulative_num.
My input set has data for only 3 dates, and I have a fixed date range over which I want to gap-fill.
Fixed date range: 2017-01-01 to 2017-01-08
Ref. SQL to generate the data:
WITH temp_data AS (
SELECT '2017-01-03'::DATE AS e_date, 'uk'::VARCHAR AS cnt, 1::int AS cnt_id, 10::int AS numbers, 10::int AS cumulative_num
UNION
SELECT '2017-01-05'::DATE AS e_date, 'uk'::VARCHAR AS cnt, 1::int AS cnt_id, 20::int AS numbers, 30::int AS cumulative_num
UNION
SELECT '2017-01-07'::DATE AS e_date, 'uk'::VARCHAR AS cnt, 1::int AS cnt_id, 40::int AS numbers, 70::int AS cumulative_num
UNION
SELECT '2017-01-03'::DATE AS e_date, 'fr'::VARCHAR AS cnt, 2::int AS cnt_id, 100::int AS numbers, 100::int AS cumulative_num
UNION
SELECT '2017-01-05'::DATE AS e_date, 'fr'::VARCHAR AS cnt, 2::int AS cnt_id, 200::int AS numbers, 300::int AS cumulative_num
UNION
SELECT '2017-01-07'::DATE AS e_date, 'fr'::VARCHAR AS cnt, 2::int AS cnt_id, 500::int AS numbers, 800::int AS cumulative_num
)
SELECT * FROM temp_data ORDER BY cnt_id, e_date
My input data looks like this:
e_date      cnt  cnt_id  numbers  cumulative_num
----------  ---  ------  -------  --------------
2017-01-03  uk   1       10       10
2017-01-05  uk   1       20       30
2017-01-07  uk   1       40       70
2017-01-03  fr   2       100      100
2017-01-05  fr   2       200      300
2017-01-07  fr   2       500      800
...         ..   ..      ..       ...
My expected result looks like this:
e_date      cnt  cnt_id  num  cumulative_num
----------  ---  ------  ---  --------------
2017-01-01  uk   1       0    0
2017-01-02  uk   1       0    0
2017-01-03  uk   1       10   10
2017-01-04  uk   1       0    10
2017-01-05  uk   1       20   30
2017-01-06  uk   1       0    30
2017-01-07  uk   1       40   70
2017-01-08  uk   1       0    70
2017-01-01  fr   2       0    0
2017-01-02  fr   2       0    0
2017-01-03  fr   2       100  100
2017-01-04  fr   2       0    100
2017-01-05  fr   2       200  300
2017-01-06  fr   2       0    300
2017-01-07  fr   2       500  800
2017-01-08  fr   2       0    800
Note: I am tagging both postgresql and vertica because they follow almost the same SQL syntax standards. A solution in either database is welcome.
I think this is what you are looking for. It gives exactly what you show as the desired output, and at the very least you can use it as a starting point for your query, because I think cumulative_num should actually be calculated rather than taken from the temp data:
WITH temp_data AS (
SELECT '2017-01-03'::DATE AS e_date, 'uk'::VARCHAR AS cnt, 1::int AS cnt_id, 10::int AS numbers, 10::int AS cumulative_num
UNION
SELECT '2017-01-05'::DATE AS e_date, 'uk'::VARCHAR AS cnt, 1::int AS cnt_id, 20::int AS numbers, 30::int AS cumulative_num
UNION
SELECT '2017-01-07'::DATE AS e_date, 'uk'::VARCHAR AS cnt, 1::int AS cnt_id, 40::int AS numbers, 70::int AS cumulative_num
UNION
SELECT '2017-01-03'::DATE AS e_date, 'fr'::VARCHAR AS cnt, 2::int AS cnt_id, 100::int AS numbers, 100::int AS cumulative_num
UNION
SELECT '2017-01-05'::DATE AS e_date, 'fr'::VARCHAR AS cnt, 2::int AS cnt_id, 200::int AS numbers, 300::int AS cumulative_num
UNION
SELECT '2017-01-07'::DATE AS e_date, 'fr'::VARCHAR AS cnt, 2::int AS cnt_id, 500::int AS numbers, 800::int AS cumulative_num
)
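-- The running max carries the last known cumulative_num forward over the filled-in days
-- (this relies on cumulative_num never decreasing within a cnt_id).
SELECT e_date, cnt, cnt_id, numbers,
       max(cumulative_num) OVER (PARTITION BY cnt_id ORDER BY e_date
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_num
FROM (
    -- One row per (date, dimension) pair; days without data get 0
    SELECT t.my_date::date AS e_date, c.cnt, c.cnt_id,
           coalesce(tmp.numbers, 0) AS numbers,
           coalesce(tmp.cumulative_num, 0) AS cumulative_num
    FROM generate_series('2017-01-01'::date, '2017-01-08'::date, '1 day'::interval) AS t(my_date)
    CROSS JOIN (SELECT DISTINCT cnt, cnt_id FROM temp_data) c
    LEFT JOIN temp_data tmp ON t.my_date = tmp.e_date AND c.cnt_id = tmp.cnt_id
) src
-- Sort in the outer query; an ORDER BY inside the subquery does not guarantee the final row order
ORDER BY cnt_id, e_date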
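If cumulative_num is to be calculated rather than read from the input, one possible variant is a running sum over the gap-filled numbers. This is only a sketch for PostgreSQL: the daily_data CTE is a hypothetical stand-in for temp_data without the cumulative_num column, and it assumes at most one row per cnt_id and date.

```sql
-- Hypothetical input: same rows as temp_data, but without cumulative_num
WITH daily_data (e_date, cnt, cnt_id, numbers) AS (
    VALUES ('2017-01-03'::date, 'uk', 1, 10),
           ('2017-01-05'::date, 'uk', 1, 20),
           ('2017-01-07'::date, 'uk', 1, 40),
           ('2017-01-03'::date, 'fr', 2, 100),
           ('2017-01-05'::date, 'fr', 2, 200),
           ('2017-01-07'::date, 'fr', 2, 500)
)
SELECT t.my_date::date AS e_date,
       c.cnt,
       c.cnt_id,
       -- Days with no matching row are filled with 0
       coalesce(d.numbers, 0) AS numbers,
       -- Running total of the gap-filled daily values, restarted for each cnt_id
       sum(coalesce(d.numbers, 0)) OVER (PARTITION BY c.cnt_id ORDER BY t.my_date) AS cumulative_num
FROM generate_series('2017-01-01'::date, '2017-01-08'::date, '1 day'::interval) AS t(my_date)
CROSS JOIN (SELECT DISTINCT cnt, cnt_id FROM daily_data) c
LEFT JOIN daily_data d ON d.e_date = t.my_date AND d.cnt_id = c.cnt_id
ORDER BY c.cnt_id, t.my_date;
```

For the sample rows the running sum yields the same cumulative values as in the question (10, 30, 70 for uk and 100, 300, 800 for fr), and unlike the running max it does not depend on the stored cumulative_num being correct or non-decreasing.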