
SQL-Oracle Update table using recursivity

Let's assume we have a table like this one:

+--------+------------+---------------+----------------+
| Name   | Position   | Initial Date  | Final Date     |
+--------+------------+---------------+----------------+
| XXX    | 1          | 2016/06/07    | 2016/06/08     |
| XXX    | 2          | 2016/06/08    | 2016/06/09     |
| XXX    | 3          | 2016/06/09    | 2016/06/10     |
| XXX    | 4          | 2016/06/13    | 2016/06/14     |
| XXX    | 6          | 2016/06/14    | 2016/06/15     |
| YYY    | 1          | 2016/06/02    | 2016/06/03     |
+--------+------------+---------------+----------------+

I want to update it by adding a new field that holds the first position of the group each row belongs to. Rows form a group when they follow these rules:

  1. They share the same name.
  2. Their position numbers are consecutive (e.g. positions 4 and 6 would need a position 5 between them to form a group).
  3. The final date of one row matches the initial date of the next one, and so on.

Taking all of this into account, this should be the outcome:

+--------+------------+---------------+----------------+------------+
| Name   | Position   | Initial Date  | Final Date     | New field  |
+--------+------------+---------------+----------------+------------+
| XXX    | 1          | 2016/06/07    | 2016/06/08     | 1          |
| XXX    | 2          | 2016/06/08    | 2016/06/09     | 1          |
| XXX    | 3          | 2016/06/09    | 2016/06/10     | 1          |
| XXX    | 4          | 2016/06/13    | 2016/06/14     | 4          |
| XXX    | 6          | 2016/06/14    | 2016/06/15     | 6          |
| YYY    | 1          | 2016/06/02    | 2016/06/03     | 1          |
+--------+------------+---------------+----------------+------------+

I can make it work only for groups of two members; I do not know how to handle groups with more than two.

This is the example code I used, which obviously does not work for larger groups.

update table1 f1
set f1.new_field = NVL((select f2.position
                    from table1 f2
                    where f1.name = f2.name and
                    f2.position = f1.position+1 and
                    f1.final_date = f2.initial_date),f1.position);

Should I use recursive queries to solve this? I don't know how to implement it in SQL in this situation.

Any help is much appreciated!

You can do this using a series of analytic functions, like so:

with sample_data as (select 'XXX' name, 1 position, to_date('07/06/2016', 'dd/mm/yyyy') initial_date, to_date('08/06/2016', 'dd/mm/yyyy') final_date from dual union all
                     select 'XXX' name, 2 position, to_date('08/06/2016', 'dd/mm/yyyy') initial_date, to_date('09/06/2016', 'dd/mm/yyyy') final_date from dual union all
                     select 'XXX' name, 3 position, to_date('09/06/2016', 'dd/mm/yyyy') initial_date, to_date('10/06/2016', 'dd/mm/yyyy') final_date from dual union all
                     select 'XXX' name, 4 position, to_date('13/06/2016', 'dd/mm/yyyy') initial_date, to_date('14/06/2016', 'dd/mm/yyyy') final_date from dual union all
                     select 'XXX' name, 6 position, to_date('14/06/2016', 'dd/mm/yyyy') initial_date, to_date('15/06/2016', 'dd/mm/yyyy') final_date from dual union all
                     select 'YYY' name, 1 position, to_date('02/06/2016', 'dd/mm/yyyy') initial_date, to_date('03/06/2016', 'dd/mm/yyyy') final_date from dual)
-- end of mimicking a table called "sample_data" containing your data
select name,
       position,
       initial_date,
       final_date,
       min(position) over (partition by name, grp_sum) new_field
from   (select name,
               position,
               initial_date,
               final_date,
               sum(change_grp_required) over (partition by name order by position) grp_sum
        from   (select name,
                       position,
                       initial_date,
                       final_date,
                       case when position - lag(position, 1, position) over (partition by name order by position) != 1
                                 or initial_date != lag(final_date, 1, initial_date - 1) over (partition by name order by position) then 1
                            else 0
                       end change_grp_required
                from   sample_data));

NAME   POSITION INITIAL_DATE FINAL_DATE  NEW_FIELD
---- ---------- ------------ ---------- ----------
XXX           1 2016/06/07   2016/06/08          1
XXX           2 2016/06/08   2016/06/09          1
XXX           3 2016/06/09   2016/06/10          1
XXX           4 2016/06/13   2016/06/14          4
XXX           6 2016/06/14   2016/06/15          6
YYY           1 2016/06/02   2016/06/03          1

The innermost subquery checks whether the current row continues from the previous one, i.e. whether its position is exactly one more than the previous position and its initial date equals the previous final date. If it does not, the subquery emits 1; otherwise it emits 0.

The next subquery then calculates a running sum across these flags. Rows that belong together end up with the same number (e.g. 1 for positions 1 to 3, 2 for position 4 and 3 for position 6), which we can then group against.
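To make the two inner steps easier to follow, here is a small sketch that exposes the flag and the running sum side by side. It reuses the column names from the query above; the inline sample_data rows simply restate the question's table, and the expected values in the comments follow from the rules described in the question.

```sql
-- Sketch: show the intermediate flag and running sum for the sample rows.
with sample_data as (
  select 'XXX' name, 1 position, date '2016-06-07' initial_date, date '2016-06-08' final_date from dual union all
  select 'XXX' name, 2 position, date '2016-06-08' initial_date, date '2016-06-09' final_date from dual union all
  select 'XXX' name, 3 position, date '2016-06-09' initial_date, date '2016-06-10' final_date from dual union all
  select 'XXX' name, 4 position, date '2016-06-13' initial_date, date '2016-06-14' final_date from dual union all
  select 'XXX' name, 6 position, date '2016-06-14' initial_date, date '2016-06-15' final_date from dual union all
  select 'YYY' name, 1 position, date '2016-06-02' initial_date, date '2016-06-03' final_date from dual
)
select name,
       position,
       change_grp_required,
       -- running total of the flags, restarted for every name
       sum(change_grp_required) over (partition by name order by position) grp_sum
from   (select name,
               position,
               -- 1 when this row starts a new group, 0 when it continues the previous row
               case when position - lag(position, 1, position) over (partition by name order by position) != 1
                         or initial_date != lag(final_date, 1, initial_date - 1) over (partition by name order by position) then 1
                    else 0
               end change_grp_required
        from   sample_data)
order  by name, position;

-- NAME POSITION CHANGE_GRP_REQUIRED GRP_SUM
-- XXX         1                   1       1
-- XXX         2                   0       1
-- XXX         3                   0       1
-- XXX         4                   1       2
-- XXX         6                   1       3
-- YYY         1                   1       1
```

Note that the first row of each name is always flagged, because the default values passed to lag() are chosen so that both comparisons fail.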

The outer query then simply finds the minimum position number per name and the newly created grouping column.

You could then use this query in your update statement to do the actual update (obviously, you wouldn't need the initial sample_data subquery, as you'd just use your table_name in the rest of the query directly).
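As a rough sketch of what that could look like, the statement below wraps the same logic in a MERGE keyed on ROWID. It assumes the table is called table1 (the name used in the question) and that a new_field column has already been added; the aliases tgt, src and rid are only illustrative.

```sql
-- Sketch only: assumes table1 already has a NEW_FIELD column.
merge into table1 tgt
using (select rid,
              -- first position of the group each row belongs to
              min(position) over (partition by name, grp_sum) new_field
       from   (select rid,
                      name,
                      position,
                      sum(change_grp_required) over (partition by name order by position) grp_sum
               from   (select rowid rid,   -- carry the row's ROWID through so MERGE can join on it
                              name,
                              position,
                              case when position - lag(position, 1, position) over (partition by name order by position) != 1
                                        or initial_date != lag(final_date, 1, initial_date - 1) over (partition by name order by position) then 1
                                   else 0
                              end change_grp_required
                       from   table1))) src
on (tgt.rowid = src.rid)
when matched then
  update set tgt.new_field = src.new_field;
```

MERGE is used here rather than a correlated UPDATE so that the analytic query is evaluated once for the whole table instead of once per row.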

You can use the LAG() and LAST_VALUE() analytic functions to get the initial position for each group and then use MERGE (instead of UPDATE) to update the table.

Oracle Setup :

CREATE TABLE table_name ( Name, Position, Initial_Date, Final_Date ) AS
SELECT 'XXX', 1, DATE '2016-06-07', DATE '2016-06-08' FROM DUAL UNION ALL
SELECT 'XXX', 2, DATE '2016-06-08', DATE '2016-06-09' FROM DUAL UNION ALL
SELECT 'XXX', 3, DATE '2016-06-09', DATE '2016-06-10' FROM DUAL UNION ALL
SELECT 'XXX', 4, DATE '2016-06-13', DATE '2016-06-14' FROM DUAL UNION ALL
SELECT 'XXX', 6, DATE '2016-06-14', DATE '2016-06-15' FROM DUAL UNION ALL
SELECT 'YYY', 1, DATE '2016-06-02', DATE '2016-06-03' FROM DUAL;

ALTER TABLE table_name ADD new_field INT;

Update Query :

MERGE INTO table_name d
USING (
        SELECT rid,
               -- carry the most recent group start forward to every row in the group
               LAST_VALUE( start_of_group ) IGNORE NULLS
                 OVER ( PARTITION BY name ORDER BY position )
                 AS new_field
        FROM   (
          SELECT t.ROWID AS rid,  -- expose the ROWID explicitly so the outer levels can use it
                 name,
                 position,
                 -- NULL when the row continues the previous one, else its own position
                 CASE WHEN position - 1 = LAG( position   )
                                            OVER ( PARTITION BY name
                                                   ORDER BY position )
                      AND  initial_date = LAG( final_date )
                                            OVER ( PARTITION BY name
                                                   ORDER BY position )
                      THEN NULL
                      ELSE position
                      END AS start_of_group
          FROM   table_name t
        )
      ) s
      ON ( d.ROWID = s.rid )
WHEN MATCHED THEN
  UPDATE SET d.new_field = s.new_field;

Output :

SELECT * FROM table_name;

NAME   POSITION INITIAL_DATE        FINAL_DATE           NEW_FIELD
---- ---------- ------------------- ------------------- ----------
XXX           1 2016-06-07 00:00:00 2016-06-08 00:00:00          1 
XXX           2 2016-06-08 00:00:00 2016-06-09 00:00:00          1 
XXX           3 2016-06-09 00:00:00 2016-06-10 00:00:00          1 
XXX           4 2016-06-13 00:00:00 2016-06-14 00:00:00          4 
XXX           6 2016-06-14 00:00:00 2016-06-15 00:00:00          6 
YYY           1 2016-06-02 00:00:00 2016-06-03 00:00:00          1 

You can do this with window functions.

select t.*, min(position) over (partition by name, grp) as new_field
from (select t.*,
             sum(case when (prev_position = position - 1) and
                           (prev_final_date = initial_date)
                       then 0 else 1
                 end) over (partition by name order by position) as grp
      from (select t.*,
                   lag(position) over (partition by name order by position) as prev_position,
                   lag(final_date) over (partition by name order by position) as prev_final_date
            from t
           ) t
     ) t;

The basic idea is to determine whether a new group starts. The query first uses lag() to get the data from the "previous" row. I am guessing that "previous" is based on the position (rather than the initial_date).

Then, a flag is created when a group starts -- "1" for a new group, "0" if not. The cumulative sum of this flag identifies a group.

The outermost query simply assigns the minimum position in the group as the new field.
