How to merge SQL rows -- only when ALL rows meet criteria?

Question

I have a table (Oracle SQL) containing details of a list of item prices at each store location. I want to combine several rows into one -- but only when ALL rows for an item meet the criteria: the item is the same price at all locations.

The table (simplified) looks like this:

list_id, item_id, location_id, item_price
1        1        1            1.99
1        1        2            1.99
1        1        3            1.99
1        2        1            3.99
1        2        2            3.99
1        2        3            3.99
1        3        1            5.99
1        3        2            7.99
1        3        3            8.99

...and I want this:

list_id, item_id, location_id, item_price
1        1        0            1.99
1        2        0            3.99
1        3        1            5.99
1        3        2            7.99
1        3        3            8.99

Rows for items 1 and 2 have been combined into a single row each, with location_id set to 0 (meaning all locations). Rows for item 3 remain unchanged because the price is not the same in ALL locations.

This query helps me identify when an item doesn't need to be merged (its item_id appears in more than one result row):

select count(list_id), item_id, item_price 
from list_detail
group by item_id, item_price

...but I can't wrap my head around how it would fit into a larger trigger, script, or whatever which would identify and combine rows.

NOTE: I cannot change the structure of the table because it is relied on by many, many other processes.

How would you best identify and then combine rows where the price is the same in all locations? A script, trigger, scheduled console app?

Answer 1

One option uses window functions, then distinct:

select distinct list_id, item_id, location_id, item_price
from (
    select list_id, item_id, item_price,
        case when min(item_price) over(partition by list_id, item_id) = max(item_price) over(partition by list_id, item_id) 
                then 0
                else location_id
            end location_id
    from mytable t
) t

The basic idea is to compare the minimum and the maximum price in groups having the same list_id and item_id. When they are equal, we know there is just one distinct value in the group, so we set location_id to 0; otherwise we keep it as it is. All that is left to do is to keep the distinct rows.
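To try this without touching a real table, the sketch below inlines the question's nine sample rows as a CTE named mytable (the answer's placeholder name; the CTE itself is only an illustration device) and applies the same logic. It assumes an Oracle version that accepts a column list on a CTE (11gR2 or later).

```sql
-- Sample rows from the question, inlined so the query is self-contained.
with mytable (list_id, item_id, location_id, item_price) as (
    select 1, 1, 1, 1.99 from dual union all
    select 1, 1, 2, 1.99 from dual union all
    select 1, 1, 3, 1.99 from dual union all
    select 1, 2, 1, 3.99 from dual union all
    select 1, 2, 2, 3.99 from dual union all
    select 1, 2, 3, 3.99 from dual union all
    select 1, 3, 1, 5.99 from dual union all
    select 1, 3, 2, 7.99 from dual union all
    select 1, 3, 3, 8.99 from dual
)
select distinct list_id, item_id, location_id, item_price
from (
    select list_id, item_id, item_price,
           -- min = max within (list_id, item_id) means a single price everywhere
           case when min(item_price) over (partition by list_id, item_id)
                   = max(item_price) over (partition by list_id, item_id)
                then 0
                else location_id
           end as location_id
    from mytable
)
order by list_id, item_id, location_id;
```

Items 1 and 2 each collapse to a single row with location_id 0, and item 3 keeps its three rows, which matches the result the question asks for. This is a select only; it does not modify the stored rows.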

Answer 2

Since you must update some rows and delete others in a single statement, it's best to use a merge statement, which is exactly for this purpose.

The source rowset (s) is the result of an aggregation that identifies the (list_id, item_id) pairs that must be modified.

Note that I assume the price is never null; if it can be null, you must explain how that should be handled.
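To illustrate why that assumption matters: min and max skip nulls, so a group with one known price and one null price still satisfies min(item_price) = max(item_price). A minimal sketch, using hypothetical rows for an item 4 that is not in the question's data:

```sql
-- Hypothetical data: item 4 has one known price and one NULL price.
with sample_data (list_id, item_id, location_id, item_price) as (
    select 1, 4, 8, 1.99 from dual union all
    select 1, 4, 9, cast(null as number) from dual
)
select list_id, item_id,
       min(item_price)   as min_price,    -- 1.99 (the NULL is skipped)
       max(item_price)   as max_price,    -- 1.99 (the NULL is skipped)
       count(item_price) as priced_rows,  -- 1: counts non-NULL prices only
       count(*)          as all_rows      -- 2: counts every row
from sample_data
group by list_id, item_id;
```

If a null price should mean "unknown" and therefore block the merge, one possible guard is to add count(item_price) = count(*) to the having clause of the merge below. Whether that is the right rule depends on what null means in your data.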

There will be solutions offered using analytic functions. If efficiency (speed) is important, the solution below should be better: aggregation is generally faster than analytic functions when both do the same job.

merge into sample_data t
  using (
          select list_id, item_id, min(location_id) as min_loc_id
          from   sample_data
          group  by list_id, item_id
          having min(item_price) = max(item_price)
        ) s
    on (t.list_id = s.list_id and t.item_id = s.item_id)
when matched then
  update
    set   t.location_id = case when t.location_id = s.min_loc_id then 0 end
  delete
    where t.location_id is null
;

Rows from the target (which is your base table) will only be affected when they match the source by list_id, item_id; other rows will be left unchanged. (These unchanged rows are the rows for items where the price is not the same at all locations - so the corresponding list_id, item_id does not appear in the source.)

The update part will change the first location id to 0 and all the others to null. Then the delete part will delete all the rows where the location id is null. In this step, the location id is the modified one, after the update part did its work. So all the rows except one, for each affected list_id, item_id, will be deleted by the delete step.
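Before running the merge, it can help to preview what it would do to each row. The read-only sketch below reuses the same source aggregation and labels every row with the action the merge would take. The planned_action column and the inlined sample rows are additions for illustration, not part of the merge itself.

```sql
-- Dry run: same source rowset as the merge, but nothing is modified.
with sample_data (list_id, item_id, location_id, item_price) as (
    select 1, 1, 1, 1.99 from dual union all
    select 1, 1, 2, 1.99 from dual union all
    select 1, 1, 3, 1.99 from dual union all
    select 1, 2, 1, 3.99 from dual union all
    select 1, 2, 2, 3.99 from dual union all
    select 1, 2, 3, 3.99 from dual union all
    select 1, 3, 1, 5.99 from dual union all
    select 1, 3, 2, 7.99 from dual union all
    select 1, 3, 3, 8.99 from dual
),
s as (
    -- Groups whose price is identical at every location.
    select list_id, item_id, min(location_id) as min_loc_id
    from   sample_data
    group  by list_id, item_id
    having min(item_price) = max(item_price)
)
select t.list_id, t.item_id, t.location_id, t.item_price,
       case
           when s.list_id is null           then 'unchanged'    -- no match in source
           when t.location_id = s.min_loc_id then 'update to 0' -- the surviving row
           else 'delete'                                        -- set to null, then deleted
       end as planned_action
from   sample_data t
left join s
       on t.list_id = s.list_id and t.item_id = s.item_id
order  by t.list_id, t.item_id, t.location_id;
```

For the question's data, location 1 of items 1 and 2 is marked 'update to 0', locations 2 and 3 of those items are marked 'delete', and all three rows of item 3 are 'unchanged'.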

Answer 3

You can use window functions (here the table is called t and the price column is called price):

select list_id, item_id,
       (case when min_price = max_price then 0
             else location_id
        end) as location_id,
       price
from (select t.*,
             min(price) over (partition by list_id, item_id) as min_price,
             max(price) over (partition by list_id, item_id) as max_price
      from t
     ) t
group by list_id, item_id,
         (case when min_price = max_price then 0
               else location_id
          end), price;

Another method would use exists and union all:

select list_id, item_id, location_id, price
from t
where exists (select 1
              from t t2
              where t2.list_id = t.list_id and
                    t2.item_id = t.item_id and
                    t2.price <> t.price
             )
union all
select list_id, item_id, 0, max(price)
from t
group by list_id, item_id
having min(price) = max(price);

Answer 4

You can use a MERGE statement to both UPDATE and DELETE. If you use analytic functions to identify the affected rows, you can correlate the merge on the ROWID pseudo-column, which performs the self-join more efficiently than comparing values:

MERGE INTO table_name dst
USING (
  SELECT ROWID AS rid,
         rn
  FROM   (
    SELECT ROW_NUMBER() OVER ( PARTITION BY list_id, item_id ORDER BY location_id )
             AS rn,
           MIN( item_price ) OVER ( PARTITION BY list_id, item_id ) AS min_price,
           MAX( item_price ) OVER ( PARTITION BY list_id, item_id ) AS max_price
    FROM   table_name
  )
  WHERE min_price = max_price
) src
ON ( src.rid = dst.ROWID )
WHEN MATCHED THEN
  UPDATE SET location_id = 0
  DELETE WHERE src.rn > 1;

Which, for the sample data:

CREATE TABLE table_name ( list_id, item_id, location_id, item_price ) AS
SELECT 1, 1, 1, 1.99 FROM DUAL UNION ALL
SELECT 1, 1, 2, 1.99 FROM DUAL UNION ALL
SELECT 1, 1, 3, 1.99 FROM DUAL UNION ALL
SELECT 1, 2, 1, 3.99 FROM DUAL UNION ALL
SELECT 1, 2, 2, 3.99 FROM DUAL UNION ALL
SELECT 1, 2, 3, 3.99 FROM DUAL UNION ALL
SELECT 1, 3, 1, 5.99 FROM DUAL UNION ALL
SELECT 1, 3, 2, 7.99 FROM DUAL UNION ALL
SELECT 1, 3, 3, 8.99 FROM DUAL;

Updates 2 rows and deletes 4 rows leaving the table as:

LIST_ID | ITEM_ID | LOCATION_ID | ITEM_PRICE
------: | ------: | ----------: | ---------:
      1 |       1 |           0 |       1.99
      1 |       2 |           0 |       3.99
      1 |       3 |           1 |       5.99
      1 |       3 |           2 |       7.99
      1 |       3 |           3 |       8.99


If item_price can be NULL, the query can be extended to merge a group only when its prices are all non-NULL and equal, or are all NULL:

MERGE INTO table_name dst
USING (
  SELECT ROWID AS rid,
         rn
  FROM   (
    SELECT ROW_NUMBER() OVER ( PARTITION BY list_id, item_id ORDER BY location_id )
             AS rn,
           MIN( item_price ) OVER ( PARTITION BY list_id, item_id ) AS min_price,
           MAX( item_price ) OVER ( PARTITION BY list_id, item_id ) AS max_price,
           COUNT(item_price)
             OVER ( PARTITION BY list_id, item_id ) AS num_non_null,
           COUNT(*)
             OVER ( PARTITION BY list_id, item_id ) AS num_locations
    FROM   table_name
  )
  WHERE ( min_price = max_price AND num_non_null = num_locations )
  OR    ( num_non_null = 0 )
) src
ON ( src.rid = dst.ROWID )
WHEN MATCHED THEN
  UPDATE SET location_id = 0
  DELETE WHERE src.rn > 1;

Which, for the sample data:

CREATE TABLE table_name ( list_id, item_id, location_id, item_price ) AS
SELECT 1, 1, 1, 1.99 FROM DUAL UNION ALL
SELECT 1, 1, 2, 1.99 FROM DUAL UNION ALL
SELECT 1, 1, 3, 1.99 FROM DUAL UNION ALL
SELECT 1, 2, 2, 3.99 FROM DUAL UNION ALL
SELECT 1, 2, 3, 3.99 FROM DUAL UNION ALL
SELECT 1, 2, 4, 3.99 FROM DUAL UNION ALL
SELECT 1, 3, 1, 5.99 FROM DUAL UNION ALL
SELECT 1, 3, 2, 7.99 FROM DUAL UNION ALL
SELECT 1, 3, 3, 8.99 FROM DUAL UNION ALL
SELECT 1, 4, 8, 1.99 FROM DUAL UNION ALL
SELECT 1, 4, 9, NULL FROM DUAL UNION ALL
SELECT 1, 5, 1, NULL FROM DUAL UNION ALL
SELECT 1, 5, 2, NULL FROM DUAL;

Outputs:

LIST_ID | ITEM_ID | LOCATION_ID | ITEM_PRICE
------: | ------: | ----------: | ---------:
      1 |       1 |           0 |       1.99
      1 |       2 |           0 |       3.99
      1 |       3 |           1 |       5.99
      1 |       3 |           2 |       7.99
      1 |       3 |           3 |       8.99
      1 |       4 |           8 |       1.99
      1 |       4 |           9 |       null
      1 |       5 |           0 |       null


4 answers

solution1
2 2020-10-26 22:06:36

solution2
2 2020-10-26 22:34:18

solution3
0 2020-10-26 22:03:34

solution4
0 2020-10-26 23:06:22