简体   繁体   中英

How would I extract a substring from a string and create a new numeric column in Oracle SQL

I have a column (DFI6) where values looks like this and I have a millions of them. I am trying to create a new NUMERIC column with the COD (COD1,COD2....COD5) values.

IMAGE OF THE TABLE THIS IS THE WHAT IT SHOULD LOOK LIKE, I HAVE THE FIRST COLUMN AND I AM TRYING TO CREATE 5 NEW ONES

Right now I am at the beginning stage, where I am trying to create individual queries and testing them out, rather than one big nested query.

Sometimes the value for COD5 is a literal "Nothing". My query to extract COD5

 select DFI6, (case when DFI6 like '%COD5%' THEN (case when REGEXP_LIKE(SUBSTR(SUBSTR(DFI6, INSTR(DFI6,'COD5=')),6),'[az]') then '-1' else (nvl(SUBSTR(SUBSTR(DFI6, INSTR(DFI6,'COD5=')),6),'-1')) end) else '-1' end)+0 --note: I added zero to turn the column into numeric as COD5 from myTable where updated_date>sysdate-300 and DFI6 is not null

Query to extract COD4 is:

 select DFI6, (case when DFI6 like '%COD4%' THEN (case when REGEXP_LIKE(SUBSTR(SUBSTR(DFI6, INSTR(DFI6,'COD4=')),6),'[az]') then '-1' else SUBSTR(SUBSTR(DFI6, INSTR(DFI6, 'COD3'), INSTR(DFI6, '|COD4') - INSTR(DFI6, 'COD3')),6) end) else '-1' end)+0 --note: I added zero to turn the column into numeric as COD4 from myTable where updated_date>sysdate-30 and DFI6 is not null

ISSUE:

  • Some of the DEF6 columns do not contain a certain COD, and if they don't then -1 should be in the relevant COD sub-column.
  • Some of the COD values can be null and if they are then -1 should be in the relevant COD column.
  • When I am adding 0 at the end of the case statement the columns are numeric, but I fail to fetch all the rows. But when I don't add zero the column is non-numeric but fetches all the rows. Basically I am loosing 600-900k rows.
  • I have tried to wrap the whole case statement into to_number, and it did not work.

I am using SQL navigator6.2.1 oracle DB version [19.16.0.0.0] and some of the new functions do not work with my version. I am very new to SQL and would appreciate any help. Thank you

Try this - using just CASE expression and SubStr() function:

WITH      -- Sample data
    tbl  AS
        (
            Select 'COD1=01YTEFG|COD2=22|COD3=33|COD4=21|COD5=22' "DFI6" From Dual Union All
            Select 'COD1=02YTEFG|COD2=25|COD3=37|COD4=27|COD5=28' "DFI6" From Dual Union All
            Select 'COD1=01YTEFG|COD2=22|COD3=33|COD4=21|COD5=22' "DFI6" From Dual Union All
            Select 'COD1=02YTEFG|COD2=25|COD3=37|COD4=27' "DFI6" From Dual Union All        -- NO COD5
            Select 'COD1=02YTEFG|COD2=253|COD3=37|COD4=27' "DFI6" From Dual Union All        -- NO COD5  AND   COD2 HAS THREE NUMBERS
            Select 'COD1=None|COD2=22|COD3=33|COD4=21|COD5=22' "DFI6" From Dual Union All     -- COD1 = None
            Select 'COD1=02YTEFG|COD5=28' "DFI6" From Dual Union All                          -- NO CODES 2, 3 AND 4
            Select 'COD1=01YTEFG|COD2=22|COD3=33|COD4=21|COD5=22' "DFI6" From Dual Union All
            Select 'COD1=02YTEFG|COD2=25|COD3=37|COD4=27|COD5=Nothing' "DFI6" From Dual 
        )
--
--  Main SQL
SELECT
    DFI6,
    CASE 
        WHEN  INSTR(DFI6, 'COD1=') = 0 OR                                               -- element is not here OR
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 1)) = 0        -- first char after 'COD1=' is not numeric
        THEN -1   
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 7, 1)) > 0 AND    -- third and
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 6, 1)) > 0 AND    -- second and
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 1)) > 0        -- first char after 'COD1=' are all numerics 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 3))                   -- take all three 
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 6, 1)) > 0 AND    -- second and
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 1)) > 0        -- first char after 'COD1=' are both numerics
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 2))                   -- take them both 
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 1)) > 0        -- just 1 numeric char after 'COD1='
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD1=') + 5, 1))                   -- take it
    ELSE -1 END "COD1",                                                     -- ELSE -1
    --          ---------------------------------------------
    CASE 
       WHEN  INSTR(DFI6, 'COD2=') = 0 OR 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 1)) = 0
        THEN -1   
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 7, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 1)) > 0
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 3))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 2))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD2=') + 5, 1))
    ELSE -1 END "COD2",
    --          ---------------------------------------------
    CASE 
       WHEN  INSTR(DFI6, 'COD3=') = 0 OR 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 1)) = 0
        THEN -1   
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 7, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 1)) > 0
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 3))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 2))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD3=') + 5, 1))
    ELSE -1 END "COD3",
    --          ---------------------------------------------
    CASE 
       WHEN  INSTR(DFI6, 'COD4=') = 0 OR 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 1)) = 0
        THEN -1   
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 7, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 1)) > 0
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 3))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 2))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD4=') + 5, 1))
    ELSE -1 END "COD4",
    --          ---------------------------------------------
    CASE 
       WHEN  INSTR(DFI6, 'COD5=') = 0 OR 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 1)) = 0
        THEN -1   
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 7, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 1)) > 0
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 3))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 6, 1)) > 0 AND 
              INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 2))
        --
        WHEN  INSTR('0123456789', SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 1)) > 0 
        THEN To_Number(SubStr(DFI6, InStr(DFI6, 'COD5=') + 5, 1))
    ELSE -1 END "COD5"
FROM tbl
--
/*    R e s u l t :
DFI6                                                    COD1       COD2       COD3       COD4       COD5
------------------------------------------------- ---------- ---------- ---------- ---------- ----------
COD1=01YTEFG|COD2=22|COD3=33|COD4=21|COD5=22               1         22         33         21         22 
COD1=02YTEFG|COD2=25|COD3=37|COD4=27|COD5=28               2         25         37         27         28 
COD1=01YTEFG|COD2=22|COD3=33|COD4=21|COD5=22               1         22         33         21         22 
COD1=02YTEFG|COD2=25|COD3=37|COD4=27                       2         25         37         27         -1 
COD1=02YTEFG|COD2=253|COD3=37|COD4=27                      2        253         37         27         -1 
COD1=None|COD2=22|COD3=33|COD4=21|COD5=22                 -1         22         33         21         22 
COD1=02YTEFG|COD5=28                                       2         -1         -1         -1         28 
COD1=01YTEFG|COD2=22|COD3=33|COD4=21|COD5=22               1         22         33         21         22 
COD1=02YTEFG|COD2=25|COD3=37|COD4=27|COD5=Nothing          2         25         37         27         -1
*/

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM