I have a table with three columns, X being the identifier. I want to get the row number when I partition by column Y, but only while Z is in consecutive order. For example, I have this table:
X Y Z
A 1 1-jan
A 1 2-jan
A 1 3-jan
B 3 1-jan
B 3 2-jan
A 1 5-jan
The result should look like this:
X Y Z rn
A 1 1-jan 1
A 1 2-jan 2
A 1 3-jan 3
B 3 1-jan 1
B 3 2-jan 2
A 1 5-jan 1
The code I am using right now:
select X, Y, Z, ROW_NUMBER() over (partition by Y order by Z) as rn
I am getting this as my result (This is not the result I want):
X Y Z rn
A 1 1-jan 1
A 1 2-jan 2
A 1 3-jan 3
B 3 1-jan 1
B 3 2-jan 2
A 1 5-jan 4 <---- Column Z is not 4-jan, therefore this should not be row 4. It should be a new row 1
You first need to create data that can be used to partition your table.
The below uses LAG() to determine if a row is a "new partition", then SUM() OVER () to propagate that flag forward and make a "partition id", then finally uses ROW_NUMBER() with that identifier.
WITH
gap_marker AS
(
SELECT
yourTable.*,
IIF(
LAG(z) OVER (PARTITION BY y ORDER BY z)
=
DATEADD(day, -1, z),
0,
1
)
AS new_date_range
FROM
yourTable
),
date_range_partition AS
(
SELECT
gap_marker.*,
SUM(new_date_range) OVER (PARTITION BY y ORDER BY z) AS date_range_id
FROM
gap_marker
)
SELECT
x, y, z,
ROW_NUMBER() OVER (PARTITION BY y, date_range_id ORDER BY z) AS rn
FROM
date_range_partition
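To see what the intermediate columns look like, here is a minimal sketch that runs the same three steps against the question's sample rows. It is an illustration under assumptions, not the answer's tested code: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later), writes CASE instead of IIF, and uses SQLite's date(z, '-1 day') in place of DATEADD(day, -1, z). The names ROWS, QUERY and run_islands are made up for the example.

```python
import sqlite3

# Sample rows from the question, with dates written as ISO strings (an assumption).
ROWS = [
    ("A", 1, "2022-01-01"),
    ("A", 1, "2022-01-02"),
    ("A", 1, "2022-01-03"),
    ("B", 3, "2022-01-01"),
    ("B", 3, "2022-01-02"),
    ("A", 1, "2022-01-05"),
]

QUERY = """
WITH gap_marker AS (
    SELECT x, y, z,
           -- 0 when the previous date in the partition is exactly one day earlier, else 1
           CASE WHEN LAG(z) OVER (PARTITION BY y ORDER BY z) = date(z, '-1 day')
                THEN 0 ELSE 1 END AS new_date_range
    FROM yourTable
),
date_range_partition AS (
    SELECT *,
           -- running total of the flags gives each consecutive run its own id
           SUM(new_date_range) OVER (PARTITION BY y ORDER BY z) AS date_range_id
    FROM gap_marker
)
SELECT x, y, z, new_date_range, date_range_id,
       ROW_NUMBER() OVER (PARTITION BY y, date_range_id ORDER BY z) AS rn
FROM date_range_partition
ORDER BY y, z
"""


def run_islands(rows):
    """Load rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (x TEXT, y INTEGER, z TEXT)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in run_islands(ROWS):
    print(row)
```

Each printed tuple is (x, y, z, new_date_range, date_range_id, rn). The 5-jan row gets new_date_range = 1, which bumps date_range_id to 2, so ROW_NUMBER() starts again at 1 for it.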
Alternatively, you could calculate an amount to deduct from the current rn, to reset to 1 when a date is skipped.
WITH
enumerated AS
(
SELECT
yourTable.*,
ROW_NUMBER() OVER (PARTITION BY y ORDER BY z) AS rn,
DATEDIFF(
day,
LAG(z) OVER (PARTITION BY y ORDER BY z),
z
)
AS delta
FROM
yourTable
)
SELECT
x, y, z,
rn - MAX(IIF(delta = 1, 0, rn - 1)) OVER (PARTITION BY y ORDER BY z) AS rn
FROM
enumerated
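The running MAX() is the part that is easy to misread, so here is a small Python trace of the same idea for a single partition. It is only a sketch: the function name reset_row_numbers is made up, and it assumes the dates within the partition are distinct.

```python
from datetime import date


def reset_row_numbers(dates):
    """Return (z, rn, offset, rn - offset) for each date in ascending order.

    offset plays the role of MAX(IIF(delta = 1, 0, rn - 1)) OVER (ORDER BY z):
    it only grows on rows that are not exactly one day after the previous row.
    """
    out = []
    offset = 0   # running maximum of (rn - 1) over rows that start a new run
    prev = None  # previous date, i.e. LAG(z)
    for rn, z in enumerate(sorted(dates), start=1):
        delta = (z - prev).days if prev is not None else None
        if delta != 1:
            # First row of the partition, or a skipped date: remember rn - 1
            offset = max(offset, rn - 1)
        out.append((z, rn, offset, rn - offset))
        prev = z
    return out


sample = [date(2022, 1, d) for d in (1, 2, 3, 5, 6)]
for z, rn, offset, new_rn in reset_row_numbers(sample):
    print(z, rn, offset, new_rn)
```

Because rn only increases, the most recent run start always has the largest rn - 1, which is why a running MAX() is enough to pick it up.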
Finally, you could use DATEDIFF() if your rows are always whole days apart. Window functions can be used to work out what you should compare the current row against, and avoid ROW_NUMBER() altogether.
WITH
check_previous AS
(
SELECT
yourTable.*,
IIF(
LAG(z) OVER (PARTITION BY y ORDER BY z)
=
DATEADD(day, -1, z),
NULL,
z
)
AS new_base_date
FROM
yourTable
)
SELECT
x, y, z,
DATEDIFF(
day,
MAX(new_base_date) OVER (PARTITION BY y ORDER BY z),
z
) + 1
AS rn
FROM
check_previous
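As a rough illustration of that last query, the following Python sketch does the same thing for one partition: it remembers the most recent date that started a new run and numbers each row by its distance in days from that date. The function name rn_from_base_date is hypothetical, and distinct dates within the partition are assumed.

```python
from datetime import date


def rn_from_base_date(dates):
    """Return (z, base, rn) where rn = days since the current run's first date, plus 1."""
    out = []
    base = None  # most recent run start, i.e. the running MAX(new_base_date)
    prev = None  # previous date, i.e. LAG(z)
    for z in sorted(dates):
        if prev is None or (z - prev).days != 1:
            base = z  # this row starts a new run
        out.append((z, base, (z - base).days + 1))
        prev = z
    return out


sample = [date(2022, 1, d) for d in (1, 2, 3, 5, 6)]
for z, base, rn in rn_from_base_date(sample):
    print(z, base, rn)
```

This relies on every row inside a run being exactly one day after the previous one, so the day difference from the run's first date is the same as a row count.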
Demo of all three: https://dbfiddle.uk/K8x8gOqh
Supposing that column Z is a date column, you could try the following:
SELECT X, Y, Z,
ROW_NUMBER() OVER (PARTITION BY X, GRP ORDER BY Z) AS RN
FROM
(
SELECT *,
DATEDIFF(DAY, ROW_NUMBER() OVER (PARTITION BY X ORDER BY Z), Z) AS GRP
FROM table_name
) T
ORDER BY X, Z
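The reason this works is that, inside a run of consecutive dates, the date and the row number both go up by one per row, so their difference stays constant; a skipped date changes the difference and therefore the group. Below is a small Python sketch of that invariant for a single partition. The function name number_within_runs is made up, and it assumes the dates are distinct.

```python
from datetime import date, timedelta
from itertools import groupby


def number_within_runs(dates):
    """Return (z, rn) pairs, with rn restarting at 1 after every skipped date."""
    ordered = sorted(dates)
    # Group key: the date minus its 1-based position in the ordered list.
    # Consecutive dates share a key; a gap moves the key forward.
    keyed = [(z - timedelta(days=i), z) for i, z in enumerate(ordered, start=1)]
    result = []
    for _, run in groupby(keyed, key=lambda pair: pair[0]):
        for rn, (_, z) in enumerate(run, start=1):
            result.append((z, rn))
    return result


sample = [date(2022, 1, d) for d in (1, 2, 3, 5, 6)]
for z, rn in number_within_runs(sample):
    print(z, rn)
```

In the SQL above, GRP is the same kind of key: the absolute value it takes does not matter, only that it is identical for all rows of one run.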
If the datatype of column Z is not date, you can generate the groups of consecutive values as follows:
SELECT X, Y, Z,
ROW_NUMBER() OVER (PARTITION BY X, GRP ORDER BY Z) AS RN
FROM
(
SELECT *,
CAST(SUBSTRING(Z, 0, CHARINDEX('-', Z)) AS INT) -
ROW_NUMBER() OVER (PARTITION BY X ORDER BY SUBSTRING(Z, CHARINDEX('-', Z)+1, LEN(Z)), CAST(SUBSTRING(Z, 0, CHARINDEX('-', Z)) AS INT)) AS GRP
FROM table_name2
) T
ORDER BY X, MONTH(SUBSTRING(Z, CHARINDEX('-', Z)+1, LEN(Z))+' 1 1'), CAST(SUBSTRING(Z, 0, CHARINDEX('-', Z)) AS INT)
I solved this problem using PostgreSQL. Extract the logic and convert it into your SQL dialect.
DDL statement:
create table demo
(
x varchar(10) not null,
y int not null,
z date);
insert into demo(x,y,z) values
('A',1,'2022-01-01'),
('A',1,'2022-01-02'),
('A',1,'2022-01-03'),
('B',3,'2022-01-01'),
('B',3,'2022-01-02'),
('A',1,'2022-01-05');
query:
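with base_data as (
select x,y,z,
-- days since the previous row; the first row of each (x,y) counts as 1
z - coalesce(lag(z) over(partition by x,y order by z),z-1::INT) as diff
from demo
)
,staging_data as (
select x,y,z,
-- running count of gaps: goes up by 1 each time a date is skipped
sum(case when diff > 1 then 1 else 0 end) over(partition by x,y order by z) as grp
from base_data)
select
x,y,z,row_number() over(partition by x,y,grp order by z)
from staging_data
order by x,z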
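Note: z-1::INT is PostgreSQL syntax. In SQL Server, use DATEADD(day, -1, z) instead, and compute the difference between the two dates with DATEDIFF(day, ..., z), since the date type there does not support the - operator. Hopefully this change will work in SQL Server.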
output:
x|y|z |row_number|
-+-+----------+----------+
A|1|2022-01-01| 1|
A|1|2022-01-02| 2|
A|1|2022-01-03| 3|
A|1|2022-01-05| 1|
B|3|2022-01-01| 1|
B|3|2022-01-02| 2|