
SQL Query to Get Valid Data on Given Date

I have a database that tracks a person's salary over time, like the table below:

[image: source database table]

I want to query each person's salary (by id) for every month, producing output like the table below:

[image: query result]

I don't know what query to use, since it needs to iterate over the salary table to find which salary is valid on a given date. Any ideas?

Thanks!

Here is a complete query using the example data. As already shown, you need a valid-from date...

WITH
-- your input ...
indata(id,datevaliduntil,salary) AS (
          SELECT 1001,DATE '9999-12-31', 5000
UNION ALL SELECT 1001,DATE '2020-08-31', 4000
UNION ALL SELECT 1001,DATE '2020-04-30', 3000
)
,
-- make it almost like a slowly changing dimension
-- table - add a valid-from-date ...
scd AS (
  SELECT
    id
  , LAG(datevaliduntil,1,DATE '1900-01-01') OVER (
      PARTITION BY id ORDER BY datevaliduntil
    ) AS datevalidfrom
  , datevaliduntil
  , salary
  FROM indata
)
,
-- the months from the example ...
months(monthend) AS (
  SELECT 
    mon::DATE - 1 AS monthend
  FROM 
  GENERATE_SERIES(
    '2020-04-01'::DATE
  , '2021-03-01'::DATE
  , INTERVAL '1 MONTH'
  ) gs(mon)
)
SELECT
  monthend
, id
, salary
FROM scd
JOIN months ON monthend >  datevalidfrom
           AND monthend <= datevaliduntil
ORDER BY 1
;
-- out   monthend  |  id  | salary 
-- out ------------+------+--------
-- out  2020-03-31 | 1001 |   3000
-- out  2020-04-30 | 1001 |   3000
-- out  2020-05-31 | 1001 |   4000
-- out  2020-06-30 | 1001 |   4000
-- out  2020-07-31 | 1001 |   4000
-- out  2020-08-31 | 1001 |   4000
-- out  2020-09-30 | 1001 |   5000
-- out  2020-10-31 | 1001 |   5000
-- out  2020-11-30 | 1001 |   5000
-- out  2020-12-31 | 1001 |   5000
-- out  2021-01-31 | 1001 |   5000
-- out  2021-02-28 | 1001 |   5000
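
To make the interval logic explicit, here is a minimal Python sketch of the same two steps: derive a valid-from date from the previous row's datevaliduntil, then keep each month-end that falls in the half-open range (datevalidfrom, datevaliduntil]. The function names add_valid_from, month_ends and salary_by_month_end are hypothetical helpers for illustration only; the data is the example data from the query above.

```python
from datetime import date, timedelta

# Example rows from the query above: (id, datevaliduntil, salary).
rows = [
    (1001, date(9999, 12, 31), 5000),
    (1001, date(2020, 8, 31), 4000),
    (1001, date(2020, 4, 30), 3000),
]

def add_valid_from(rows, floor=date(1900, 1, 1)):
    """Mimic LAG(datevaliduntil, 1, floor) per id, ordered by datevaliduntil."""
    out = []
    prev_until = {}  # id -> datevaliduntil of the previous row for that id
    for pid, until, salary in sorted(rows, key=lambda r: (r[0], r[1])):
        out.append((pid, prev_until.get(pid, floor), until, salary))
        prev_until[pid] = until
    return out

def month_ends(first_month_start, last_month_start):
    """Yield the day before each month start, like `mon::DATE - 1`."""
    y, m = first_month_start.year, first_month_start.month
    while (y, m) <= (last_month_start.year, last_month_start.month):
        yield date(y, m, 1) - timedelta(days=1)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)

def salary_by_month_end(rows, first_month_start, last_month_start):
    """Join month-ends to rows where valid_from < month_end <= valid_until."""
    scd = add_valid_from(rows)
    result = []
    for d in month_ends(first_month_start, last_month_start):
        for pid, valid_from, valid_until, salary in scd:
            if valid_from < d <= valid_until:
                result.append((d, pid, salary))
    return result

result = salary_by_month_end(rows, date(2020, 4, 1), date(2021, 3, 1))
for d, pid, salary in result:
    print(d, pid, salary)
```

Because the lower bound is exclusive and the upper bound inclusive, a month-end such as 2020-04-30 matches exactly one salary row, which is why the join returns one row per month.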

This is a convenient place to use a lateral join. The following goes by the first day of the month rather than the last day -- because that is simpler to generate:

select i.id, gs.mon, s.salary
from generate_series('2019-01-01'::date, '2020-12-01'::date, interval '1 month') gs(mon) cross join
    (select distinct id from salaries) i left join lateral
    (select s.salary
     from salaries s
     where s.id = i.id and s.datevaliduntil >= gs.mon
     order by s.datevaliduntil asc
     limit 1
    ) s on true;

Of course, you can just subtract 1 day from each date if you want the last day.
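
The lookup rule in that subquery ("take the row with the smallest datevaliduntil that is not before the month") can be tried without a PostgreSQL server. The sketch below is an assumption-laden port to SQLite through Python's sqlite3 module: SQLite has no LATERAL or generate_series by default, so it uses a correlated scalar subquery and a recursive CTE instead, stores dates as ISO-8601 text, and uses a shorter month range than the query above. The table and column names follow the answer; everything else is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: dates stored as ISO-8601 text, which sorts chronologically.
    CREATE TABLE salaries (id INTEGER, datevaliduntil TEXT, salary INTEGER);
    INSERT INTO salaries VALUES
        (1001, '9999-12-31', 5000),
        (1001, '2020-08-31', 4000),
        (1001, '2020-04-30', 3000);
""")

query = """
WITH RECURSIVE months(mon) AS (
    -- Stand-in for generate_series: first day of each month.
    SELECT '2020-04-01'
    UNION ALL
    SELECT date(mon, '+1 month') FROM months WHERE mon < '2020-10-01'
)
SELECT i.id, m.mon,
       -- Correlated subquery playing the role of the lateral join:
       -- earliest validity end that is on or after the month start.
       (SELECT s.salary
        FROM salaries s
        WHERE s.id = i.id AND s.datevaliduntil >= m.mon
        ORDER BY s.datevaliduntil
        LIMIT 1) AS salary
FROM months m
CROSS JOIN (SELECT DISTINCT id FROM salaries) i
ORDER BY m.mon, i.id
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the example data, the month starting 2020-04-01 resolves to 3000, the months from 2020-05-01 through 2020-08-01 to 4000, and later months to 5000.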

I would use a lateral join, but the other way around: start from the table itself, bring in the previous date with lag(), then use generate_series() to generate the dates in between. A little additional logic is needed to adjust to the ends of months:

select x.date - interval '1 day' date, t.id, t.salary
from (
    select id, salary,
        datevaliduntil + interval '1 day' datevaliduntil, 
        lag(datevaliduntil, 1, datevaliduntil) 
            over(partition by id order by datevaliduntil) + interval '1 day' lag_datevaliduntil 
    from mytable t
) t
cross join lateral generate_series(
    t.lag_datevaliduntil, 
    least(t.datevaliduntil, '2021-03-01'),
    '1 month'
) x(date)

You control the overall upper bound with the literal date in the second argument to generate_series (here the series stops at 2021-03-01, so after subtracting one day the last reported month-end is 2021-02-28).

Demo on DB Fiddle:

date                |   id | salary
:------------------ | ---: | -----:
2020-04-30 00:00:00 | 1001 |   3000
2020-04-30 00:00:00 | 1001 |   4000
2020-05-31 00:00:00 | 1001 |   4000
2020-06-30 00:00:00 | 1001 |   4000
2020-07-31 00:00:00 | 1001 |   4000
2020-08-31 00:00:00 | 1001 |   4000
2020-08-31 00:00:00 | 1001 |   5000
2020-09-30 00:00:00 | 1001 |   5000
2020-10-31 00:00:00 | 1001 |   5000
2020-11-30 00:00:00 | 1001 |   5000
2020-12-31 00:00:00 | 1001 |   5000
2021-01-31 00:00:00 | 1001 |   5000
2021-02-28 00:00:00 | 1001 |   5000
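
In the output above, 2020-04-30 and 2020-08-31 each appear twice, because every generated series starts on the previous period's end date. If one row per month is wanted, one possible fix is to start each period one month after the previous period's end. The Python sketch below illustrates that idea only; it is not the answer's query. It assumes every datevaliduntil is a month-end date, and the helpers next_month_end and expand, as well as the cap of 2021-02-28, are hypothetical.

```python
from datetime import date, timedelta

# Example rows: (id, datevaliduntil, salary).
rows = [
    (1001, date(9999, 12, 31), 5000),
    (1001, date(2020, 8, 31), 4000),
    (1001, date(2020, 4, 30), 3000),
]

def next_month_end(d):
    """Return the last day of the month that follows d's month."""
    y, m = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    # First day of the month after (y, m), minus one day.
    y2, m2 = (y + 1, 1) if m == 12 else (y, m + 1)
    return date(y2, m2, 1) - timedelta(days=1)

def expand(rows, cap):
    """Emit one (month_end, id, salary) row per month in each validity period."""
    out = []
    prev_until = {}  # id -> end date of the previous period
    for pid, until, salary in sorted(rows, key=lambda r: (r[0], r[1])):
        prev = prev_until.get(pid)
        # The first period has no predecessor, so only its own end date is
        # emitted (the SQL above does the same through lag's default value).
        d = until if prev is None else next_month_end(prev)
        while d <= min(until, cap):
            out.append((d, pid, salary))
            d = next_month_end(d)
        prev_until[pid] = until
    return out

expanded = expand(rows, cap=date(2021, 2, 28))
for d, pid, salary in expanded:
    print(d, pid, salary)
```

With the example data this yields 11 rows, from 2020-04-30 (3000) to 2021-02-28 (5000), with each month-end listed once.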

