I have the data in the following format in my database:
Name Values Start_of_week End_of_week
Name1 1_2_2_1_1_2_1 22-Dec-19 28-Dec-19
Name1 1_2_2_1_2_2_1 29-Dec-19 04-Jan-20
Name1 1_2_2_2_2_2_1 05-Jan-20 11-Jan-20
Name1 1_2_2_2_2_2_1 12-Jan-20 18-Jan-20
Name1 1_2_2_2_2_2_1 19-Jan-20 25-Jan-20
Name1 1_2_2_2_2_2_1 26-Jan-20 01-Feb-20
Name1 1_2_2_2_2_2_1 02-Feb-20 08-Feb-20
Name1 1_2_2_2_2_2_1 09-Feb-20 15-Feb-20
Name1 1_2_2_2_2_2_1 16-Feb-20 22-Feb-20
Name1 1_2_2_2_2_2_1 23-Feb-20 29-Feb-20
Name1 1_2_2_2_2_2_1 01-Mar-20 07-Mar-20
Name2 1_2_2_1_1_2_1 22-Dec-19 28-Dec-19
Name2 1_2_2_2_2_2_2 29-Dec-19 04-Jan-20
Name2 1_2_2_2_2_2_2 05-Jan-20 11-Jan-20
Name2 1_2_2_2_2_2_2 12-Jan-20 18-Jan-20
Name2 1_2_2_2_2_2_2 19-Jan-20 25-Jan-20
Name2 1_2_2_2_2_2_2 26-Jan-20 01-Feb-20
Name2 1_2_2_2_2_2_2 02-Feb-20 08-Feb-20
Name2 1_2_2_2_2_2_2 09-Feb-20 15-Feb-20
Name2 1_2_2_2_2_2_2 16-Feb-20 22-Feb-20
Name2 1_2_2_2_2_2_2 23-Feb-20 29-Feb-20
Name2 1_2_2_2_2_2_2 01-Mar-20 07-Mar-20
For each name, I need to compare the Values column of consecutive rows and update End_of_week accordingly. For example, the first and second rows have different Values, so End_of_week does not need to change. The third and fourth rows have the same Values, so the third row's End_of_week should be replaced with the fourth row's End_of_week.
The merged row should then be compared with the next row, and if Values is still the same, End_of_week should again be taken from that next row. This should be repeated for every run of rows within each name.
I tried to compare the rows using the lead() function, but I was unable to compare against the next set of rows after the update. The expected result is:
Name Values start_of_week end_of_week
Name1 1_2_2_1_1_2_1 22-Dec-19 28-Dec-19
Name1 1_2_2_1_2_2_1 29-Dec-19 04-Jan-20
Name1 1_2_2_2_2_2_1 05-Jan-20 07-Mar-20
Name2 1_2_2_1_1_2_1 22-Dec-19 28-Dec-19
Name2 1_2_2_2_2_2_2 29-Dec-19 07-Mar-20
This is a gaps-and-islands problem. A simple solution is the difference of row numbers:
-- value stands for the question's Values column; t is the table
select name, value,
       min(start_of_week), max(end_of_week)
from (select t.*,
             row_number() over (partition by name order by start_of_week) as seqnum,
             row_number() over (partition by name, value order by start_of_week) as seqnum_2
      from t
     ) t
group by name, value, (seqnum - seqnum_2);
Why this works is a little tricky to explain. But if you look at the results of the subquery, you will see how the difference of row numbers identifies adjacent rows with the same values.
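To make the subquery's output visible, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The table name `t`, the column name `value`, and the ISO date strings (used so that the dates sort correctly as text) are assumptions for illustration. Only the first five Name1 rows from the question are loaded, and it assumes an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical in-memory copy of the first five Name1 rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (name text, value text, start_of_week text, end_of_week text)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("Name1", "1_2_2_1_1_2_1", "2019-12-22", "2019-12-28"),
        ("Name1", "1_2_2_1_2_2_1", "2019-12-29", "2020-01-04"),
        ("Name1", "1_2_2_2_2_2_1", "2020-01-05", "2020-01-11"),
        ("Name1", "1_2_2_2_2_2_1", "2020-01-12", "2020-01-18"),
        ("Name1", "1_2_2_2_2_2_1", "2020-01-19", "2020-01-25"),
    ],
)

# The inner query from the answer: two row numbers per row.
subquery = """
    select t.*,
           row_number() over (partition by name order by start_of_week) as seqnum,
           row_number() over (partition by name, value order by start_of_week) as seqnum_2
    from t
"""

# Print the two row numbers and their difference for every row.
diffs = []
for name, value, start, end, seqnum, seqnum_2 in conn.execute(
    subquery + " order by name, start_of_week"
):
    diffs.append(seqnum - seqnum_2)
    print(value, start, seqnum, seqnum_2, seqnum - seqnum_2)

# Grouping by the difference collapses each run of equal values into one row.
merged = conn.execute(
    f"""
    select name, value, min(start_of_week), max(end_of_week)
    from ({subquery}) s
    group by name, value, (seqnum - seqnum_2)
    order by name, min(start_of_week)
    """
).fetchall()
print(merged)
```

Within a run of adjacent rows that share the same value, both row numbers increase by one per row, so their difference stays constant. That constant, together with name and value, is what the outer group by uses to identify each run.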
Looking at the sample data, I think it is not a gaps-and-islands problem. You can achieve the desired output using a plain group by.
select name, value,
       min(start_of_week),
       max(end_of_week)
from your_table
group by name, value;
Cheers!!
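The two answers agree on the sample data because no value ever comes back after changing. A small sketch of where they would differ, using hypothetical rows that are not in the question (name `N`, values `A`, `A`, `B`, `A`) in an in-memory SQLite table; the table name `t` and the ISO date strings are assumptions, and an SQLite build with window functions (3.25 or later) is assumed:

```python
import sqlite3

# Hypothetical data: value A appears, changes to B, then returns to A.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (name text, value text, start_of_week text, end_of_week text)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("N", "A", "2020-01-05", "2020-01-11"),
        ("N", "A", "2020-01-12", "2020-01-18"),
        ("N", "B", "2020-01-19", "2020-01-25"),
        ("N", "A", "2020-01-26", "2020-02-01"),
    ],
)

# Plain group by: every row with the same value is merged, adjacent or not.
plain = conn.execute(
    """
    select name, value, min(start_of_week), max(end_of_week)
    from t
    group by name, value
    order by name, min(start_of_week)
    """
).fetchall()

# Difference of row numbers: only adjacent rows with the same value are merged.
islands = conn.execute(
    """
    select name, value, min(start_of_week), max(end_of_week)
    from (select t.*,
                 row_number() over (partition by name order by start_of_week) as seqnum,
                 row_number() over (partition by name, value order by start_of_week) as seqnum_2
          from t
         ) s
    group by name, value, (seqnum - seqnum_2)
    order by name, min(start_of_week)
    """
).fetchall()

print(plain)
print(islands)
```

With these rows the plain group by returns a single `A` row whose date range also covers the `B` week, while the row-number version keeps the two `A` runs separate. Which behaviour is wanted depends on whether a value can recur in the real data.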