简体   繁体   中英

Aggregate date ranges without including overlapping dates

I have a table storing records of vehicles being loaned out. A sample of the data looks like:

ASSET_ID CATEGORY LOANED_TO LOAN_START LOAN_END
00001 Keyed Cordon 01/01/2022 2:00 PM 01/03/2022 8:00 AM
00002 Keyless JIP 01/15/2022 8:30 AM 01/15/2022 10:00 AM
00002 Keyless Cordon 01/15/2022 09:30 AM 01/15/2022 10:30 AM
00002 Keyless Craft 01/15/2022 12:00 PM 01/15/2022 06:00 PM

As you can see from the sample above. Sometimes an asset can be loaned for multiple days (eg Asset ID: 0001 from 01/01/2022-01/03/2022) and sometimes the asset can be loaned for hours, including overlapping times (I know this seems like an issue but this is how the data is and overlpaping hours should not be double accounted.

Using the sample data above, Asset ID 0001 should look like the following:

DATE ASSET_ID PCT_UNAVAILABLE
01/01/2022 00001 .42
01/02/2022 00001 1
01/03/2022 00001 .33
01/15/2022 00002 .33

Using the intial data sample this can be translated as follows:

- Asset ID 00001  was unavilable for 10 hours (2 PM - midnight) on 01/01/2022 - 42 %
- Asset ID 00001  was unavailable for 24 hours (midnight-midnight) on 01/02/2022 - 100%
- Asset ID 00001  was unavilable for 8 hours (midnight-8 AM) on 01/03/2022 - 33%
- Asset ID 00002  was unavailable for 8 hours (08:30 -10:30; 12:00-06:00) on 01/15/2022 - 33%

I was able to create a date table ( CALENDAR ) that has a record for each date in a particular date range. This table, joined with the table showing loaned assets, will let me create daily records instead of ranged records; however, I have no idea how to aggregate the time.

My approach so far is to explode the records into daily records and calculate the unavilability percentage by each record, like below:

SELECT 
       CAL.DATE
      ,LA.ASSET_ID
      ,CASE
          WHEN loan_end - loan_start >= 1 then 1 else loan_end - loan_start
       END           PCT_UNAVAILABLE
FROM   CALENDAR      CAL
JOIN   LOANED_ASSETS LA  on CAL.DATE >= LA.LOAN_START and CAL.DATE <= LA.LOAN_END

however, if I approach it this way, I am still not aggregating all non-overlapping hours in the day for all records. Any and all help would be greatly appreciated.

You can use:

WITH days ( asset_id, loan_start, day_end, loan_end) AS (
  SELECT asset_id,
         loan_start,
         LEAST(TRUNC(loan_start) + INTERVAL '1' DAY, loan_end),
         loan_end
  FROM   table_name
  MATCH_RECOGNIZE (
    PARTITION BY asset_id
    ORDER     BY loan_start, loan_end
    MEASURES
     FIRST(loan_start) AS loan_start,
     MAX(loan_end)     AS loan_end
   ONE ROW PER MATCH
   PATTERN (overlaps* next_time)
   DEFINE
     overlaps  AS MAX(overlaps.loan_end) >= loan_start,
     next_time AS MAX(overlaps.loan_end) >= loan_start
               OR MAX(overlaps.loan_end) IS NULL
   )
UNION ALL
  SELECT asset_id,
         day_end,
         LEAST(TRUNC(day_end) + INTERVAL '1' DAY, loan_end),
         loan_end
  FROM   days
  WHERE  day_end < loan_end
)
SEARCH DEPTH FIRST BY asset_id, loan_start SET order_id
SELECT asset_id,
       TO_CHAR(TRUNC(loan_start), 'YYYY-MM-DD') AS day,
       SUM(day_end - loan_start) AS amt_unavailable
FROM   days
GROUP BY
       asset_id,
       TRUNC(loan_start)

Which, for the sample data:

CREATE TABLE table_name (ASSET_ID, LOAN_START, LOAN_END) AS
SELECT '00001', DATE '2022-01-01' + INTERVAL '14:00' HOUR TO MINUTE, DATE '2022-01-03' + INTERVAL '08:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT '00002', DATE '2022-01-15' + INTERVAL '08:30' HOUR TO MINUTE, DATE '2022-01-15' + INTERVAL '10:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT '00002', DATE '2022-01-15' + INTERVAL '09:30' HOUR TO MINUTE, DATE '2022-01-15' + INTERVAL '10:30' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT '00002', DATE '2022-01-15' + INTERVAL '12:00' HOUR TO MINUTE, DATE '2022-01-15' + INTERVAL '18:00' HOUR TO MINUTE FROM DUAL;

Outputs:

ASSET_ID DAY AMT_UNAVAILABLE
00001 2022-01-01 .4166666666666666666666666666666666666667
00001 2022-01-02 1
00001 2022-01-03 .3333333333333333333333333333333333333333
00002 2022-01-15 .3333333333333333333333333333333333333333

db<>fiddle here

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM