Let's assume we have a table like this one:
+--------+------------+---------------+----------------+
| Name | Position | Initial Date | Final Date |
+--------+------------+---------------+----------------+
| XXX | 1 | 2016/06/07 | 2016/06/08 |
| XXX | 2 | 2016/06/08 | 2016/06/09 |
| XXX | 3 | 2016/06/09 | 2016/06/10 |
| XXX | 4 | 2016/06/13 | 2016/06/14 |
| XXX | 6 | 2016/06/14 | 2016/06/15 |
| YYY | 1 | 2016/06/02 | 2016/06/03 |
+--------+------------+---------------+----------------+
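I want to update it by adding a new field that indicates the first position of a group. A row forms part of a group when it follows these rules (the same ones my query below checks):
- It has the same name as the previous row.
- Its position is exactly one more than the previous row's position.
- Its initial date equals the previous row's final date.
Taking all of this into consideration, this should be the outcome: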
+--------+------------+---------------+----------------+------------+
| Name | Position | Initial Date | Final Date | New field |
+--------+------------+---------------+----------------+------------+
| XXX | 1 | 2016/06/07 | 2016/06/08 | 1 |
| XXX | 2 | 2016/06/08 | 2016/06/09 | 1 |
| XXX | 3 | 2016/06/09 | 2016/06/10 | 1 |
| XXX | 4 | 2016/06/13 | 2016/06/14 | 4 |
| XXX | 6 | 2016/06/14 | 2016/06/15 | 6 |
| YYY | 1 | 2016/06/02 | 2016/06/03 | 1 |
+--------+------------+---------------+----------------+------------+
I can make it work only for groups of two members, but I do not know how to approach groups with more than two members.
This is the example code I used, which obviously does not work for larger groups.
update table1 f1
set f1.new_field = NVL((select f2.position
                        from table1 f2
                        where f1.name = f2.name and
                              f2.position = f1.position+1 and
                              f1.final_date = f2.initial_date), f1.position);
Should I use recursive queries to solve this? I don't know how to implement it in SQL in this situation.
Any help is much appreciated!
You can do this using a series of analytic functions, like so:
with sample_data as (select 'XXX' name, 1 position, to_date('07/06/2016', 'dd/mm/yyyy') initial_date, to_date('08/06/2016', 'dd/mm/yyyy') final_date from dual union all
select 'XXX' name, 2 position, to_date('08/06/2016', 'dd/mm/yyyy') initial_date, to_date('09/06/2016', 'dd/mm/yyyy') final_date from dual union all
select 'XXX' name, 3 position, to_date('09/06/2016', 'dd/mm/yyyy') initial_date, to_date('10/06/2016', 'dd/mm/yyyy') final_date from dual union all
select 'XXX' name, 4 position, to_date('13/06/2016', 'dd/mm/yyyy') initial_date, to_date('14/06/2016', 'dd/mm/yyyy') final_date from dual union all
select 'XXX' name, 6 position, to_date('14/06/2016', 'dd/mm/yyyy') initial_date, to_date('15/06/2016', 'dd/mm/yyyy') final_date from dual union all
select 'YYY' name, 1 position, to_date('02/06/2016', 'dd/mm/yyyy') initial_date, to_date('03/06/2016', 'dd/mm/yyyy') final_date from dual)
-- end of mimicking a table called "sample_data" containing your data
select name,
       position,
       initial_date,
       final_date,
       min(position) over (partition by name, grp_sum) new_field
from   (select name,
               position,
               initial_date,
               final_date,
               sum(change_grp_required) over (partition by name order by position) grp_sum
        from   (select name,
                       position,
                       initial_date,
                       final_date,
                       case when position - lag(position, 1, position) over (partition by name order by position) != 1
                                 or initial_date != lag(final_date, 1, initial_date - 1) over (partition by name order by position) then 1
                            else 0
                       end change_grp_required
                from   sample_data));
NAME POSITION INITIAL_DATE FINAL_DATE NEW_FIELD
---- ---------- ------------ ---------- ----------
XXX 1 2016/06/07 2016/06/08 1
XXX 2 2016/06/08 2016/06/09 1
XXX 3 2016/06/09 2016/06/10 1
XXX 4 2016/06/13 2016/06/14 4
XXX 6 2016/06/14 2016/06/15 6
YYY 1 2016/06/02 2016/06/03 1
The innermost subquery determines whether the position and dates of the current and previous row are correlated. If they aren't, then it puts 1, otherwise it puts 0.
The next subquery then calculates a running sum across these numbers. This has the effect of creating the same number for correlated rows (e.g. 1 for positions 1 to 3, 2 for position 4 and 3 for position 6), which we can then use to group by.
The outer query then simply finds the minimum position number per name and the newly created grouping column.
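To make the three steps easier to follow, here is a minimal sketch of the same logic outside the database, written in Python. It is only an illustration, not a replacement for the query: the helper name first_positions is made up, and it assumes the rows are already sorted by name and position.

```python
from datetime import date
from itertools import accumulate

# Sample rows from the question: (name, position, initial_date, final_date),
# assumed to be sorted by name and then position.
rows = [
    ("XXX", 1, date(2016, 6, 7), date(2016, 6, 8)),
    ("XXX", 2, date(2016, 6, 8), date(2016, 6, 9)),
    ("XXX", 3, date(2016, 6, 9), date(2016, 6, 10)),
    ("XXX", 4, date(2016, 6, 13), date(2016, 6, 14)),
    ("XXX", 6, date(2016, 6, 14), date(2016, 6, 15)),
    ("YYY", 1, date(2016, 6, 2), date(2016, 6, 3)),
]


def first_positions(rows):
    """Return the new_field value for each row (hypothetical helper)."""
    # Step 1: flag each row with 1 if it starts a new group, 0 if it
    # continues the previous row's group.
    flags = []
    prev = None
    for row in rows:
        name, position, initial_date, _ = row
        continues = (
            prev is not None
            and prev[0] == name               # same name
            and position - prev[1] == 1       # consecutive position
            and initial_date == prev[3]       # dates line up
        )
        flags.append(0 if continues else 1)
        prev = row

    # Step 2: a running sum of the flags gives every row a group number.
    # A single global sum is enough here because a change of name also
    # sets the flag to 1.
    groups = list(accumulate(flags))

    # Step 3: the minimum position within each (name, group) pair.
    lowest = {}
    for (name, position, _, _), grp in zip(rows, groups):
        key = (name, grp)
        lowest[key] = min(position, lowest.get(key, position))

    return [lowest[(name, grp)] for (name, _, _, _), grp in zip(rows, groups)]


new_field = first_positions(rows)
print(new_field)  # [1, 1, 1, 4, 6, 1]
```

The flags come out as 1, 0, 0, 1, 1, 1 and the running sum as 1, 1, 1, 2, 3, 4, which is why positions 1 to 3 of XXX share a group while positions 4 and 6 each stand alone.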
You could then use this query in your update statement to do the actual update (obviously, you wouldn't need the initial sample_data subquery, as you'd just use your table_name in the rest of the query directly).
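As a rough, runnable illustration of applying the result back to the table, the sketch below uses SQLite through Python's sqlite3 module as a stand-in for Oracle. Several things here are assumptions: the dates are stored as ISO text, (name, position) is taken to identify a row uniquely, the flag is computed with plain LAG() calls without default values, and the SQLite build must support window functions (3.25 or later). In Oracle you would more likely do the write-back in a single MERGE or correlated UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (name TEXT, position INTEGER, "
    "initial_date TEXT, final_date TEXT, new_field INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 (name, position, initial_date, final_date) "
    "VALUES (?, ?, ?, ?)",
    [
        ("XXX", 1, "2016-06-07", "2016-06-08"),
        ("XXX", 2, "2016-06-08", "2016-06-09"),
        ("XXX", 3, "2016-06-09", "2016-06-10"),
        ("XXX", 4, "2016-06-13", "2016-06-14"),
        ("XXX", 6, "2016-06-14", "2016-06-15"),
        ("YYY", 1, "2016-06-02", "2016-06-03"),
    ],
)

# Flag -> running sum -> minimum position per group, as described above.
# A NULL from LAG() on the first row of a name falls through to ELSE 1.
GROUP_QUERY = """
SELECT name, position,
       MIN(position) OVER (PARTITION BY name, grp_sum) AS new_field
FROM (SELECT name, position,
             SUM(change_grp_required)
               OVER (PARTITION BY name ORDER BY position) AS grp_sum
      FROM (SELECT name, position,
                   CASE WHEN LAG(position)
                               OVER (PARTITION BY name ORDER BY position)
                             = position - 1
                         AND LAG(final_date)
                               OVER (PARTITION BY name ORDER BY position)
                             = initial_date
                        THEN 0 ELSE 1
                   END AS change_grp_required
            FROM table1))
"""

# Read the computed values first, then write them back row by row.
updates = [
    (new_field, name, position)
    for name, position, new_field in conn.execute(GROUP_QUERY)
]
conn.executemany(
    "UPDATE table1 SET new_field = ? WHERE name = ? AND position = ?",
    updates,
)

result = conn.execute(
    "SELECT name, position, new_field FROM table1 ORDER BY name, position"
).fetchall()
print(result)
```

Reading the values before writing keeps the example simple and avoids updating the table while the window query is still scanning it.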
You can use the LAG() and LAST_VALUE() analytic functions to get the initial position for each group and then use MERGE (instead of UPDATE) to update the table.
Oracle Setup :
CREATE TABLE table_name ( Name, Position, Initial_Date, Final_Date ) AS
SELECT 'XXX', 1, DATE '2016-06-07', DATE '2016-06-08' FROM DUAL UNION ALL
SELECT 'XXX', 2, DATE '2016-06-08', DATE '2016-06-09' FROM DUAL UNION ALL
SELECT 'XXX', 3, DATE '2016-06-09', DATE '2016-06-10' FROM DUAL UNION ALL
SELECT 'XXX', 4, DATE '2016-06-13', DATE '2016-06-14' FROM DUAL UNION ALL
SELECT 'XXX', 6, DATE '2016-06-14', DATE '2016-06-15' FROM DUAL UNION ALL
SELECT 'YYY', 1, DATE '2016-06-02', DATE '2016-06-03' FROM DUAL;
ALTER TABLE table_name ADD new_field INT;
Update Query :
MERGE INTO table_name d
USING (
  SELECT rid,
         LAST_VALUE( start_of_group ) IGNORE NULLS
           OVER ( PARTITION BY name ORDER BY position )
           AS new_field
  FROM (
    SELECT t.ROWID AS rid,  -- expose the row identifier explicitly for the ON clause
           name,
           position,
           CASE WHEN position - 1 = LAG( position )
                                      OVER ( PARTITION BY name
                                             ORDER BY position )
                 AND initial_date = LAG( final_date )
                                      OVER ( PARTITION BY name
                                             ORDER BY position )
                THEN NULL      -- row continues the current group
                ELSE position  -- row starts a new group
           END AS start_of_group
    FROM table_name t
  )
) s
ON ( d.ROWID = s.rid )
WHEN MATCHED THEN
  UPDATE SET new_field = s.new_field;
Output :
SELECT * FROM table_name;
NAME POSITION INITIAL_DATE FINAL_DATE NEW_FIELD
---- ---------- ------------------- ------------------- ----------
XXX 1 2016-06-07 00:00:00 2016-06-08 00:00:00 1
XXX 2 2016-06-08 00:00:00 2016-06-09 00:00:00 1
XXX 3 2016-06-09 00:00:00 2016-06-10 00:00:00 1
XXX 4 2016-06-13 00:00:00 2016-06-14 00:00:00 4
XXX 6 2016-06-14 00:00:00 2016-06-15 00:00:00 6
YYY 1 2016-06-02 00:00:00 2016-06-03 00:00:00 1
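The LAST_VALUE( ... ) IGNORE NULLS step is effectively a "carry the last non-null value forward" operation within each name. A minimal Python sketch of that idea, using a made-up helper name carry_forward and the start_of_group values the inner query would produce for XXX:

```python
def carry_forward(values):
    """Replace each None with the most recent non-None value seen so far."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value  # a new group starts here
        filled.append(last)
    return filled


# start_of_group for the XXX rows, ordered by position:
# positions 2 and 3 continue the group that starts at position 1.
start_of_group = [1, None, None, 4, 6]
new_field = carry_forward(start_of_group)
print(new_field)  # [1, 1, 1, 4, 6]
```

In the query this happens once per name, because the window is partitioned by name and ordered by position.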
You can do this with window functions.
select t.*, min(position) over (partition by name, grp) as new_field
from (select t.*,
             -- order by makes this a cumulative sum rather than a total per name
             sum(case when (prev_position = position - 1) and
                           (prev_final_date = initial_date)
                      then 0 else 1
                 end) over (partition by name order by position) as grp
      from (select t.*,
                   lag(position) over (partition by name order by position) as prev_position,
                   lag(final_date) over (partition by name order by position) as prev_final_date
            from t
           ) t
     ) t;
The basic idea is to determine whether a new group starts. This first uses lag() to get the data in the "previous" row. I am guessing that "previous" is based on the position (rather than the initial_date).
Then, a flag is created when a group starts -- "1" for a new group, "0" if not. The cumulative sum of this flag identifies a group.
The outermost query simply assigns the minimum position in the group as the new field.