Oracle SQL data modification

I have a table like the one shown below:

 RowID | ID | col        | Label |
-------+----+------------+-------+
   1   | 1  | Bad        |   N   |
   2   | 1  | Very Good  |   N   | <-- either of these two should become 'Y' (contains 'Good')
   3   | 1  | Super Good |   N   | <--
   4   | 1  | Too Bad    |   Y   | <-- should become 'N' (contains 'Bad')
   5   | 2  | F*** Good  |   N   |
   6   | 2  | So Good    |   Y   |
   7   | 2  | Really Bad |   Y   | <-- both should become 'N' (contain 'Bad')
   8   | 2  | Bad        |   Y   | <--
   9   | 3  | Good       |   N   |
  10   | 3  | Good       |   Y   |
  11   | 3  | Bad Bad    |   N   |

col has two types of content, which can be distinguished by the keywords Good and Bad. RowID does not exist in my real table; I only use it here to describe the problem more clearly.

My goal:

For each group of rows with the same ID, I want to set the Label of every Bad row to N (if any of them is Y), and set the Label of at least one row whose col contains Good to Y (if the labels of all the Good rows are N).

For example:

For the group where ID = 1 (RowID = 1 to 4), I want the Label of RowID = 4 to be N instead of Y, and the Label of either RowID = 2 or RowID = 3 changed to Y.

For the group where ID = 2 (RowID = 5 to 8), since RowID = 6 is already Y, I just want to set the Label of both RowID = 7 and RowID = 8 to N.

For the group where ID = 3 (RowID = 9 to 11), since the Label of RowID = 10 is Y and the Label of RowID = 11 is N, I do not need to modify this group.

I was thinking about using a cursor in PL/SQL, but that would be very inefficient; after all, I have 1.8 billion rows. Could anyone please show me how to write this in Oracle SQL? Thank you in advance!

I think this gets the result you want:

select t.*,
  case -- Bad rows are always N
       when col like '%Bad%' then 'N'
       -- a Good row for this ID is already Y: leave the Good rows as they are
       when max(case when col like '%Good%' then label else 'N' end)
         over (partition by id) = 'Y'
       then label
       -- otherwise flag one arbitrary Good row for this ID
       when row_number()
         over (partition by id, case when col like '%Good%' then 'Y' end order by null) = 1
         then 'Y'
       else label
  end as new_label
from your_table t
order by rn;

        RN         ID COL        LABEL NEW_LABEL
---------- ---------- ---------- ----- ---------
         1          1 Bad        N     N        
         2          1 Very Good  N     Y        
         3          1 Super Good N     N        
         4          1 Too Bad    Y     N        
         5          2 F*** Good  N     N        
         6          2 So Good    Y     Y        
         7          2 Really Bad Y     N        
         8          2 Bad        Y     N        
         9          3 Good       N     N        
        10          3 Good       Y     Y        
        11          3 Bad Bad    N     N        

(rn isn't used in the calculation; I've just left it in as a dummy column for ordering the result set.)
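To try the query against the sample data from the question, a throwaway table could be built along these lines. This is only a sketch: the column types and sizes are assumptions (the question does not show them), and rn simply mirrors the illustrative RowID column.

```sql
-- Hypothetical test table; column types and sizes are assumed, not taken from the question.
create table your_table (
  rn    number,        -- dummy ordering column, mirrors RowID in the question
  id    number,
  col   varchar2(20),
  label varchar2(1)
);

insert into your_table (rn, id, col, label) values (1,  1, 'Bad',        'N');
insert into your_table (rn, id, col, label) values (2,  1, 'Very Good',  'N');
insert into your_table (rn, id, col, label) values (3,  1, 'Super Good', 'N');
insert into your_table (rn, id, col, label) values (4,  1, 'Too Bad',    'Y');
insert into your_table (rn, id, col, label) values (5,  2, 'F*** Good',  'N');
insert into your_table (rn, id, col, label) values (6,  2, 'So Good',    'Y');
insert into your_table (rn, id, col, label) values (7,  2, 'Really Bad', 'Y');
insert into your_table (rn, id, col, label) values (8,  2, 'Bad',        'Y');
insert into your_table (rn, id, col, label) values (9,  3, 'Good',       'N');
insert into your_table (rn, id, col, label) values (10, 3, 'Good',       'Y');
insert into your_table (rn, id, col, label) values (11, 3, 'Bad Bad',    'N');

commit;
```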

This sets N for any value containing Bad. Anything else is assumed to be a Good value: if any Good value for the ID is already Y, the flags for all Good values for that ID are left alone; otherwise one Good value, chosen arbitrarily (the order by null leaves the choice to the database), is set to Y.
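To see how the three branches of the case expression behave, it can help to select the intermediate values on their own. The query below is only an illustrative breakdown; the column aliases is_bad, any_good_flagged and pos_in_group are names made up for this sketch.

```sql
select t.rn, t.id, t.col, t.label,
  -- branch 1: rows containing Bad are always set to N
  case when col like '%Bad%' then 'Y' else 'N' end as is_bad,
  -- branch 2: 'Y' if any Good row for this ID already carries Y
  max(case when col like '%Good%' then label else 'N' end)
    over (partition by id) as any_good_flagged,
  -- branch 3: position within the (id, Good / not Good) group;
  -- the Good row with position 1 is the one that would be flagged
  row_number()
    over (partition by id, case when col like '%Good%' then 'Y' end order by null) as pos_in_group
from your_table t
order by rn;
```

With the sample data, any_good_flagged comes out as N for ID 1 and Y for IDs 2 and 3, which is why only ID 1 gets a newly flagged Good row. The values of pos_in_group within a group are not deterministic, because of the order by null.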

Updating or merging your table based on this will be difficult, as there isn't a single key to correlate against, at least in what you've shown, unless (maybe) the values are unique within an ID, or there is another key column you haven't shown.
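If no such key existed, one possibility would be to correlate on Oracle's ROWID pseudocolumn. This is not part of the original answer and is untested against the real table, so treat it as a sketch; the alias rid is made up for the example.

```sql
merge into your_table target
using (
  -- expose each row's physical address so the merge has something to join on
  select t.rowid as rid,
    case when col like '%Bad%' then 'N'
         when max(case when col like '%Good%' then label else 'N' end)
           over (partition by id) = 'Y' then label
         when row_number()
           over (partition by id, case when col like '%Good%' then 'Y' end
             order by t.rowid) = 1 then 'Y'
         else label
    end as new_label
  from your_table t
) source on (source.rid = target.rowid)
when matched then
  update set target.label = source.new_label
  -- only touch rows whose label actually changes
  where target.label is null or target.label != source.new_label;
```

A ROWID identifies a row only for as long as the row does not move, so it suits a one-off statement like this rather than serving as a stored, long-term key.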


I do have a unique ID column (using sys_guid())

You can use that in a merge, and also to make the choice of which of rows 2 and 3 becomes Y slightly less arbitrary:

merge into your_table target
using (
  select t.*,
    case when col like '%Bad%' then 'N'
         when max(case when col like '%Good%' then label else 'N' end)
           over (partition by id) = 'Y' then label
         when row_number()
           over (partition by id, case when col like '%Good%' then 'Y' end
             order by guid_col) = 1 then 'Y'
         else label
    end as new_label
  from your_table t
) source on (source.guid_col = target.guid_col)
when matched then
  update set target.label = source.new_label
  where target.label is null or target.label != source.new_label
/

4 rows merged.

The where clause is so you only update the rows where the label has actually changed, rather than updating every row in the table.
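After the merge, a sanity check along the following lines could confirm that every ID satisfies both rules. It is a sketch that assumes no col value contains both keywords; the column aliases are made up for the example. It should return no rows when the labels are consistent.

```sql
select id,
  count(case when col like '%Bad%'  and label = 'Y' then 1 end) as bad_rows_still_y,
  count(case when col like '%Good%' then 1 end)                 as good_rows,
  count(case when col like '%Good%' and label = 'Y' then 1 end) as good_rows_y
from your_table
group by id
-- report an ID if a Bad row is still Y, or if it has Good rows but none is Y
having count(case when col like '%Bad%' and label = 'Y' then 1 end) > 0
    or (count(case when col like '%Good%' then 1 end) > 0
        and count(case when col like '%Good%' and label = 'Y' then 1 end) = 0);
```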

Also, if this is something you're going to have to recalculate regularly as data is modified, it might be better to use a view that always calculates the label on the fly; though you could then see the 'Y' moving around within an ID when it wouldn't with this method.
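A view along those lines might look like the following sketch. The view name your_table_labelled is hypothetical, and guid_col is the unique column used in the merge above.

```sql
create or replace view your_table_labelled as
select t.guid_col, t.id, t.col,
  -- same rules as the merge, evaluated every time the view is queried
  case when t.col like '%Bad%' then 'N'
       when max(case when t.col like '%Good%' then t.label else 'N' end)
         over (partition by t.id) = 'Y' then t.label
       when row_number()
         over (partition by t.id, case when t.col like '%Good%' then 'Y' end
           order by t.guid_col) = 1 then 'Y'
       else t.label
  end as label
from your_table t;
```

Because nothing is stored, the row that shows Y for an ID can change as rows are inserted or deleted: the Good row with the lowest guid_col takes the flag whenever no Good row in the base table is already Y.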
