Oracle: Split delimited string and pick date greater than the input date

I have a comma-delimited string as a value in a column, like the one below.

'2015/04/01 11 GG, 2015/08/03 78 KK, 2012/12/12 44 TT, 2015/09/01 77 YY, 2015/09/01 88 ZZ'

Within that string, each item combines three columns, delimited by spaces.

The requirement is to pick the date that is greater than, and closest to, a given input date, and return the 2nd column of that item.

Example: if the input date is 01-AUG-2015, the output should be 78, since 2015/08/03 is the closest later date. If no date is greater than the input date, the output should be empty.

This is a complex requirement which, as commented by Gordon Linoff, would be much simpler to solve if the data were already stored in separate rows and columns.

Here is an approach:

  • First use a hierarchical query (CONNECT BY) with REGEXP_SUBSTR to split the string into rows, using the comma separator
  • Then split each row into 3 columns, again with REGEXP_SUBSTR, using whitespace as the separator
  • Then use aggregate functions with the KEEP (DENSE_RANK FIRST ...) clause to isolate the relevant row
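To illustrate the last step in isolation, here is a minimal sketch of KEEP (DENSE_RANK FIRST ...) on three hard-coded rows. The inline table t and its column names are made up for this illustration only.

```sql
-- Hypothetical inline data: three (date, value) pairs, two of them sharing a date.
WITH t (dt, val) AS (
    SELECT DATE '2015-08-03', '78' FROM DUAL UNION ALL
    SELECT DATE '2015-09-01', '77' FROM DUAL UNION ALL
    SELECT DATE '2015-09-01', '88' FROM DUAL
)
-- MIN(val) is evaluated only over the rows that rank first by dt,
-- i.e. the rows holding the earliest date.
SELECT MIN(val) KEEP (DENSE_RANK FIRST ORDER BY dt) AS val
FROM t;
```

Only one row holds the earliest date here, so this should return 78. Because it is an aggregate without GROUP BY, such a query still returns a single row (containing NULL) when no input row qualifies.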

Assuming that the data comes from column str in table my_table:

WITH 
    cte0 AS (
        SELECT TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) str
        FROM my_table
        CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0
    ),
    cte1 AS (
        SELECT 
            TO_DATE(REGEXP_SUBSTR(str, '\S+', 1, 1), 'yyyy/mm/dd') dt,
            REGEXP_SUBSTR(str, '\S+', 1, 2) val1,
            REGEXP_SUBSTR(str, '\S+', 1, 3) val2
        FROM cte0
        ORDER BY 1 DESC
    )
SELECT 
    MIN(dt)   keep (dense_rank first order by dt) as dt,
    MIN(val1) keep (dense_rank first order by dt) as val1,
    MIN(val2) keep (dense_rank first order by dt) as val2
FROM cte1
WHERE dt > TO_DATE(?, 'yyyy-mm-dd')

... where ? is the input date.

Demo on db<>fiddle:

 with 
     data as  (
         SELECT
             '2015/04/01 11 GG, 2015/08/03 78 KK, 2012/12/12 44 TT, 2015/09/01 77 YY, 2015/09/01 88 ZZ' str
         FROM DUAL
     ),
     cte0 AS (
         SELECT TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) str
         FROM data
         CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0
     ),
     cte1 AS (
         SELECT 
             TO_DATE(REGEXP_SUBSTR(str, '\S+', 1, 1), 'yyyy/mm/dd') dt,
             REGEXP_SUBSTR(str, '\S+', 1, 2) val1,
             REGEXP_SUBSTR(str, '\S+', 1, 3) val2
         FROM cte0
         ORDER BY 1 DESC
     )
 SELECT 
     min(dt) keep (dense_rank first order by dt) as dt,
     min(val1) keep (dense_rank first order by dt) as val1,
     min(val2) keep (dense_rank first order by dt) as val2
 FROM cte1
 WHERE dt > TO_DATE('2015-08-01', 'yyyy-mm-dd')


-------------------------
 DT        | VAL1 | VAL2
 :-------- | :--- | :---
 03-AUG-15 | 78   | KK  
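
One caveat if my_table holds more than one row: a CONNECT BY without PRIOR treats every row as a possible child of every other row, so the split multiplies rows across the table. A possible workaround, sketched below, generates the item index per row with a lateral join and aggregates per row. It assumes Oracle 12c or later and a hypothetical key column id in my_table; :input_date is a bind variable standing in for the ? placeholder.

```sql
-- Sketch only: "id" is an assumed key column, not part of the original question.
WITH
    cte0 AS (
        -- One output row per comma-separated item, generated per source row.
        SELECT t.id, TRIM(REGEXP_SUBSTR(t.str, '[^,]+', 1, n.lvl)) item
        FROM my_table t
        CROSS JOIN LATERAL (
            SELECT LEVEL lvl
            FROM DUAL
            CONNECT BY LEVEL <= REGEXP_COUNT(t.str, ',') + 1
        ) n
    ),
    cte1 AS (
        -- Split each item into its date and second column.
        SELECT
            id,
            TO_DATE(REGEXP_SUBSTR(item, '\S+', 1, 1), 'yyyy/mm/dd') dt,
            REGEXP_SUBSTR(item, '\S+', 1, 2) val1
        FROM cte0
    )
SELECT
    id,
    MIN(val1) KEEP (DENSE_RANK FIRST ORDER BY dt) AS val1
FROM cte1
WHERE dt > TO_DATE(:input_date, 'yyyy-mm-dd')
GROUP BY id
```

With GROUP BY, a source row that has no date later than the input does not appear in the result at all, rather than appearing with a NULL value.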

Here's one option, based on the sample data you provided.

SQL> with test (col) as
  2    (select '2015/04/01 11 GG, 2015/08/03 78 KK, 2012/12/12 44 TT, 2015/09/01 77 YY, 2015/09/01 88 ZZ' from dual),
  3  t_comma as
  4    (select trim(regexp_substr(col, '[^,]+', 1, level)) col2
  5     from test
  6     connect by level <= regexp_count(col, ',') + 1
  7    ),
  8  t_diff as
  9    (select col2,
 10         substr(col2, 1, 10) c_date,
 11         regexp_substr(col2, '\d+', 1, 4) c_num,
 12         regexp_substr(col2, '\w+$') c_let ,
 13         --
 14         abs(to_date(substr(col2, 1, 10), 'yyyy/mm/dd') -
 15             to_date('&&par_date', 'yyyy/mm/dd')) diff_days,
 16         --
 17         row_number() over (order by abs(to_date(substr(col2, 1, 10), 'yyyy/mm/dd') -
 18                                         to_date('&&par_date', 'yyyy/mm/dd'))) rn
 19     from t_comma
 20    )
 21  select c_num
 22  from t_diff
 23  where rn = 1;
Enter value for par_date: 2015-08-01

C_NUM
--------------------------------------------------------------------------------
78

SQL>

What does it do?

  • TEST is your sample table (represented by a CTE)
  • T_COMMA splits comma-separated values string into rows (so you get 5 rows out of sample data)
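  • T_DIFF extracts each part of the sub-string (i.e. each row), calculates the difference between the sample date and the parametrized date, and ranks the rows by the absolute value of that difference - RN = 1 is the "closest" date. Because the absolute value is used, an earlier date can also rank first if it is closer to the parameter than any later date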
  • the final SELECT just returns that "closest" value
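
Since the ranking above uses the absolute difference, a variant that only considers dates strictly after the input may be closer to the stated requirement. The following is a sketch along those lines, not part of the original answer; the CTE name t_later is made up, and the input date is hard-coded in place of the substitution variable.

```sql
with test (col) as
  (select '2015/04/01 11 GG, 2015/08/03 78 KK, 2012/12/12 44 TT, 2015/09/01 77 YY, 2015/09/01 88 ZZ' from dual),
t_comma as
  -- split the comma-separated string into one row per item
  (select trim(regexp_substr(col, '[^,]+', 1, level)) col2
   from test
   connect by level <= regexp_count(col, ',') + 1
  ),
t_later as
  -- keep only items dated strictly after the input, ranked earliest first
  (select regexp_substr(col2, '\S+', 1, 2) c_num,
          row_number() over (order by to_date(substr(col2, 1, 10), 'yyyy/mm/dd')) rn
   from t_comma
   where to_date(substr(col2, 1, 10), 'yyyy/mm/dd') > to_date('2015/08/01', 'yyyy/mm/dd')
  )
select c_num
from t_later
where rn = 1;
```

With the sample data this should return 78; if no date is later than the input, no row is returned. When several items share the earliest later date, ROW_NUMBER picks one of them arbitrarily unless a tiebreaker is added to the ORDER BY.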
