Oracle SQL to compare a column in a row with previous row in the same column

This is how my table (Table1) currently looks in an Oracle database.

ID   Year_Mth   Product    
123  201901     1,2,3      
123  201902     1,2,4,5    
123  201903     2,3,4,6    
123  201904     1,4,5,6  

I am trying to compare the Product column in each row with the previous row and get output like the table below. For example, comparing Row 1 with Row 2 shows which products in Row 2 are new (NEW_PRODUCTS), that is, not present in Row 1.

It seems I can use either the LAG or the LEAD function, but it is tricky because of the comma delimiters between products.

ID   Year_Mth   Product    New_Products 
123  201901     1,2,3      1,2,3        
123  201902     1,2,4,5    4,5           
123  201903     2,3,4,6    3,6           
123  201904     1,4,5,6    1,5        

Here's one option. It looks as ugly as your data model :) See the comments within the code. If you're unsure what each CTE does, I suggest you run the following code step by step and review its results.

For readability, I'll split it into several parts.

with
test (id, year_mth, product) as
  -- your sample data (as well as some of my sample data)
  (select 123, 201901, '1,2,3'   from dual union all
   select 123, 201902, '1,2,4,5' from dual union all
   select 123, 201903, '2,3,4,6' from dual union all
   select 123, 201904, '1,4,5,6' from dual union all
   --
   select 888, 201901, 'apple,banana' from dual union all
   select 888, 201902, 'apple,banana' from dual union all
   select 888, 201903, 'apple,lemon'  from dual
  ),
py as
  (select id,
          year_mth ymp,                                                -- "this" year_mth
          lead(year_mth) over (partition by id order by year_mth) ymn  -- "next" year_mth
   from test
   order by id, year_mth
  ),
tabp as
  -- products that belong to "THIS" year_mth split to rows
  (select
      t.id,
      t.year_mth,
      p.ymp,
      p.ymn,
      regexp_substr(t.product, '[^,]+', 1, c.column_value) product
    from test t join py p on t.id = p.id and t.year_mth = p.ymp cross join
      table(cast(multiset(select level from dual
                          connect by level <= regexp_count(product, ',') + 1
                         ) as sys.odcinumberlist)) c
  ),
tabn as
  -- products that belong to "NEXT" year_mth split to rows
  (select
      t.id,
      t.year_mth,
      p.ymp,
      p.ymn,
      regexp_substr(t.product, '[^,]+', 1, c.column_value) product
    from test t join py p on t.id = p.id and t.year_mth = p.ymn cross join
      table(cast(multiset(select level from dual
                          connect by level <= regexp_count(product, ',') + 1
                         ) as sys.odcinumberlist)) c
  ),
newprod as
  -- MINUS set operator finds differences between "NEXT" and "THIS" year_mth
  (select id, ymn, product from tabn
   minus
   select id, ymn, product from tabp
  )
-- finally, aggregate new products (result of the previous MINUS set operation)
select
  t.id,
  t.year_mth,
  t.product,
  listagg(case when t.rn = 1 then t.product else n.product end, ',')
    within group (order by n.product) new_products
from (select a.id,
             a.year_mth,
             a.product,
             row_number() over (partition by a.id order by a.year_mth) rn
      from test a
     ) t left join newprod n on t.id = n.id and t.year_mth = n.ymn
group by t.id, t.year_mth, t.product
order by t.id, t.year_mth;

        ID   YEAR_MTH PRODUCT      NEW_PRODUCTS
---------- ---------- ------------ ------------
       123     201901 1,2,3        1,2,3
       123     201902 1,2,4,5      4,5
       123     201903 2,3,4,6      3,6
       123     201904 1,4,5,6      1,5
       888     201901 apple,banana apple,banana
       888     201902 apple,banana
       888     201903 apple,lemon  lemon

7 rows selected.
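The row-splitting step used in the tabp and tabn CTEs can also be tried in isolation. The following is a minimal sketch on a hard-coded string; the literal '1,2,4,5' is only an illustration taken from the sample data, not part of the original query. It should return one row per comma-separated product.

```sql
-- Split a comma-separated string into rows.
-- regexp_count(...) + 1 gives the number of items;
-- regexp_substr(..., level) picks the item at that position.
select level                                      as position,
       regexp_substr('1,2,4,5', '[^,]+', 1, level) as product
from dual
connect by level <= regexp_count('1,2,4,5', ',') + 1;
```

The full query wraps the same idea in table(cast(multiset(...))) so that the split runs once per source row rather than on a single literal.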

When you need to work with delimited strings like these, it is often convenient to use XQuery functions such as fn:tokenize() and fn:string-join() through xmltable().

For example:

xmltable(
       'let $x:=tokenize($a,","), $y:=tokenize($b,",")
        return fn:string-join($x[not(.=$y)],",")'
       passing product as "a"
              ,prev_product as "b"
     columns New_Products varchar(100) path '.'
    ) x

This xmltable() splits the input parameters product and prev_product and returns those substrings of product that are not in prev_product:

  1. Function tokenize($a, ",") splits the input string $a using a comma as the delimiter.
  2. $x[not(.=$y)] returns those values from $x that do not exist in $y.
  3. Function string-join($arg1, ",") concatenates the values from $arg1 using a comma as the delimiter.
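To try these three steps on their own, the expression can be run against a single hand-made row. This is a minimal sketch that assumes the strings '1,2,4,5' and '1,2,3' (the second and first sample rows) stand in for product and prev_product. Going by the expected output in the question, it should return 4,5.

```sql
-- One hard-coded row in place of the real table (an assumption for illustration)
select x.new_products
from (select '1,2,4,5' as product,
             '1,2,3'   as prev_product
      from dual) t
    ,xmltable(
       'let $x:=tokenize($a,","), $y:=tokenize($b,",")
        return fn:string-join($x[not(.=$y)],",")'
       passing t.product      as "a"
              ,t.prev_product as "b"
     columns new_products varchar2(100) path '.'
    ) x;
```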

Full example:

with
test (id, year_mth, product) as
  -- your sample data (as well as some of my sample data)
  (select 123, 201901, '1,2,3'   from dual union all
   select 123, 201902, '1,2,4,5' from dual union all
   select 123, 201903, '2,3,4,6' from dual union all
   select 123, 201904, '1,4,5,6' from dual union all
   --
   select 888, 201901, 'apple,banana' from dual union all
   select 888, 201902, 'apple,banana' from dual union all
   select 888, 201903, 'apple,lemon'  from dual
  )
select
    t.*
   ,x.*
from 
    (
     select 
        t.*
       ,lag(t.product)over(partition by id order by year_mth) prev_product
     from test t
    ) t
    ,xmltable(
       'let $x:=tokenize($a,","), $y:=tokenize($b,",")
        return fn:string-join($x[not(.=$y)],",")'
       passing product as "a"
              ,prev_product as "b"
     columns New_Products varchar(100) path '.'
    ) x;

I wrote the XQuery above in long form only to make it more readable. In practice it can be much shorter: fn:string-join(tokenize($a,",")[not(.=tokenize($b,","))],",")

with
test (id, year_mth, product) as
  -- your sample data (as well as some of my sample data)
  (select 123, 201901, '1,2,3'   from dual union all
   select 123, 201902, '1,2,4,5' from dual union all
   select 123, 201903, '2,3,4,6' from dual union all
   select 123, 201904, '1,4,5,6' from dual union all
   --
   select 888, 201901, 'apple,banana' from dual union all
   select 888, 201902, 'apple,banana' from dual union all
   select 888, 201903, 'apple,lemon'  from dual
  )
select
    t.*
   ,x.*
from 
    (
     select 
        t.*
       ,lag(t.product)over(partition by id order by year_mth) prev_product
     from test t
    ) t
    ,xmltable(
       'fn:string-join(tokenize($a,",")[not(.=tokenize($b,","))],",")'
       passing product as "a"
              ,prev_product as "b"
     columns New_Products varchar(100) path '.'
    ) x;

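Mine is similar; add a LISTAGG and GROUP BY query at the end if you want to re-pivot. Note that it flags a product as 'new' only the first time it ever appears for an ID, so a product that returns after a gap (such as 3 in 201903) is flagged 'old'.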

WITH
input(id,year_mth,product) AS (
          SELECT 123,201901,'1,2,3'    FROM dual
UNION ALL SELECT 123,201902,'1,2,4,5'  FROM dual
UNION ALL SELECT 123,201903,'2,3,4,6'  FROM dual
UNION ALL SELECT 123,201904,'1,4,5,6'  FROM dual
)
,
i(i) AS (
           SELECT 1 FROM dual
 UNION ALL SELECT 2 FROM dual
 UNION ALL SELECT 3 FROM dual
 UNION ALL SELECT 4 FROM dual
 UNION ALL SELECT 5 FROM dual
)
,
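unpivoted AS (
  -- one row per product; named "unpivoted" to stay clear of the UNPIVOT keyword
  SELECT
    id
  , i
  , year_mth
  , REGEXP_SUBSTR(product,'\d+',1,i) AS prd
  FROM input CROSS JOIN i
  -- Oracle treats '' as NULL, so test IS NOT NULL rather than <> ''
  WHERE REGEXP_SUBSTR(product,'\d+',1,i) IS NOT NULL
)
SELECT 
  u.*  -- Oracle needs a qualified * when other select-list items follow
, CASE 
    WHEN LAG(year_mth) OVER(PARTITION BY id,prd ORDER BY year_mth) IS NULL
    THEN 'new'
    ELSE 'old'
   END
FROM unpivoted u ORDER BY 3,4;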
-- out  id  | i | year_mth | prd | case 
-- out -----+---+----------+-----+------
-- out  123 | 1 |   201901 | 1   | new
-- out  123 | 2 |   201901 | 2   | new
-- out  123 | 3 |   201901 | 3   | new
-- out  123 | 1 |   201902 | 1   | old
-- out  123 | 2 |   201902 | 2   | old
-- out  123 | 3 |   201902 | 4   | new
-- out  123 | 4 |   201902 | 5   | new
-- out  123 | 1 |   201903 | 2   | old
-- out  123 | 2 |   201903 | 3   | old
-- out  123 | 3 |   201903 | 4   | old
-- out  123 | 4 |   201903 | 6   | new
-- out  123 | 1 |   201904 | 1   | old
-- out  123 | 2 |   201904 | 4   | old
-- out  123 | 3 |   201904 | 5   | old
-- out  123 | 4 |   201904 | 6   | old
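
The re-pivot step mentioned above could look roughly like the following. This is a sketch, not the answer's own code: the month_no and prev_month_no columns and the flagged CTE are names introduced here for illustration, and the number generator uses CONNECT BY in place of the UNION ALL list. It also treats a product as new when it was absent from the immediately preceding month, which is what the question's expected output asks for.

```sql
WITH
input(id, year_mth, product) AS (
            SELECT 123, 201901, '1,2,3'   FROM dual
  UNION ALL SELECT 123, 201902, '1,2,4,5' FROM dual
  UNION ALL SELECT 123, 201903, '2,3,4,6' FROM dual
  UNION ALL SELECT 123, 201904, '1,4,5,6' FROM dual
),
i(i) AS (
  -- positions 1..5; assumes at most five products per row
  SELECT LEVEL FROM dual CONNECT BY LEVEL <= 5
),
unpivoted AS (
  -- one row per product, plus a running month number per id
  SELECT id, i, year_mth, product,
         REGEXP_SUBSTR(product, '\d+', 1, i) AS prd,
         DENSE_RANK() OVER (PARTITION BY id ORDER BY year_mth) AS month_no
  FROM input CROSS JOIN i
  WHERE REGEXP_SUBSTR(product, '\d+', 1, i) IS NOT NULL
),
flagged AS (
  -- month number in which this product was last seen for the id
  SELECT u.*,
         LAG(month_no) OVER (PARTITION BY id, prd ORDER BY year_mth) AS prev_month_no
  FROM unpivoted u
)
SELECT id, year_mth, product,
       -- LISTAGG skips NULLs, so only the new products are concatenated
       LISTAGG(CASE
                 WHEN prev_month_no IS NULL OR prev_month_no <> month_no - 1
                 THEN prd
               END, ',') WITHIN GROUP (ORDER BY i) AS new_products
FROM flagged
GROUP BY id, year_mth, product
ORDER BY id, year_mth;
```

Traced by hand against the sample data, this yields 1,2,3 / 4,5 / 3,6 / 1,5 for the four months. Like the query above, it matches only numeric products because of the \d+ pattern.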
