Converting PostgreSQL recursive CTE to SQL Server
I'm having trouble adapting some recursive CTE code from PostgreSQL to SQL Server. The code comes from the book "Fighting Churn with Data".
Here is the working PostgreSQL code:
with recursive
active_period_params as (
select interval '30 days' as allowed_gap,
'2021-09-30'::date as calc_date
),
active as (
-- anchor
select distinct account_id, min(start_date) as start_date
from subscription inner join active_period_params
on start_date <= calc_date
and (end_date > calc_date or end_date is null)
group by account_id
UNION
-- recursive
select s.account_id, s.start_date
from subscription s
cross join active_period_params
inner join active e on s.account_id=e.account_id
and s.start_date < e.start_date
and s.end_date >= (e.start_date-allowed_gap)::date
)
select account_id, min(start_date) as start_date
from active
group by account_id
Here is my attempt at converting it to SQL Server. It gets stuck in a loop. I believe the problem is related to the UNION ALL that SQL Server requires.
with
active_period_params as (
select 30 as allowed_gap,
cast('2021-09-30' as date) as calc_date
),
active as (
-- anchor
select distinct account_id, min(start_date) as start_date
from subscription inner join active_period_params
on start_date <= calc_date
and (end_date > calc_date or end_date is null)
group by account_id
UNION ALL
-- recursive
select s.account_id, s.start_date
from subscription s
cross join active_period_params
inner join active e on s.account_id=e.account_id
and s.start_date < e.start_date
and s.end_date >= dateadd(day, -allowed_gap, e.start_date)
)
select account_id, min(start_date) as start_date
from active
group by account_id
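The subscription table is a list of subscriptions belonging to customers. A customer can have multiple subscriptions with overlapping dates or with gaps between them. A null end_date means the subscription is currently active and has no defined end date. Below is sample data for a single customer (account_id = 15):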
subscription
---------------------------------------------------
| id | account_id | start_date | end_date |
---------------------------------------------------
| 6 | 15 | 01/06/2021 | null |
| 5 | 15 | 01/01/2021 | null |
| 4 | 15 | 01/06/2020 | 01/02/2021 |
| 3 | 15 | 01/04/2020 | 15/05/2020 |
| 2 | 15 | 01/03/2020 | 15/05/2020 |
| 1 | 15 | 01/06/2019 | 01/01/2020 |
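To try the queries against this sample in SQL Server, a setup script along the following lines could be used. The column types are assumptions, since the table definition is not shown, and the dates are read as day/month/year, which the value 15/05/2020 implies.

```sql
-- Assumed schema: the original table definition is not shown.
CREATE TABLE subscription (
    id         int  NOT NULL PRIMARY KEY,
    account_id int  NOT NULL,
    start_date date NOT NULL,
    end_date   date NULL      -- NULL means the subscription is still active
);

-- The six sample rows for account_id = 15, written as ISO dates.
INSERT INTO subscription (id, account_id, start_date, end_date)
VALUES
    (6, 15, '2021-06-01', NULL),
    (5, 15, '2021-01-01', NULL),
    (4, 15, '2020-06-01', '2021-02-01'),
    (3, 15, '2020-04-01', '2020-05-15'),
    (2, 15, '2020-03-01', '2020-05-15'),
    (1, 15, '2019-06-01', '2020-01-01');
```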
Expected query result (as produced by the PostgreSQL code):
------------------------------
| account_id | start_date |
------------------------------
| 15 | 01/03/2020 |
Problem: the SQL Server code above gets stuck in a loop and does not produce a result.
Explanation of the PostgreSQL code:
Any help is appreciated!
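The problem appears to be related to how SQL Server handles recursive CTEs.
This is a type of gaps-and-islands problem, and it does not actually require recursion.
There are a number of solutions; here is one. Depending on your requirements there may be more efficient approaches, but this should get you started.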
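One way to read that remark (an assumption, since it is not spelled out here): in PostgreSQL, `UNION` inside a recursive CTE discards any row that has already been produced, while SQL Server only accepts `UNION ALL` there and keeps every copy. A subscription reachable through several overlapping chains is then expanded again once per chain, and on a large table that can grow fast enough to look like a hang. In the sample rows, for instance, 01/03/2020 is reached directly from 01/06/2020 and again via 01/04/2020. A minimal diagnostic sketch, in which the `lvl` column and the depth cap of 5 are hypothetical additions for illustration, counts how many rows each recursion level emits:

```sql
-- Diagnostic sketch only: lvl and the cap of 5 are illustrative additions,
-- not part of the original query.
WITH active_period_params AS (
    SELECT 30 AS allowed_gap,
           CAST('2021-09-30' AS date) AS calc_date
),
active AS (
    -- anchor: earliest subscription per account that is active on calc_date
    SELECT s.account_id, MIN(s.start_date) AS start_date, 0 AS lvl
    FROM subscription s
    INNER JOIN active_period_params p
        ON s.start_date <= p.calc_date
       AND (s.end_date > p.calc_date OR s.end_date IS NULL)
    GROUP BY s.account_id
    UNION ALL
    -- recursive: earlier subscriptions ending within allowed_gap of a row already found
    SELECT s.account_id, s.start_date, e.lvl + 1
    FROM subscription s
    CROSS JOIN active_period_params p
    INNER JOIN active e
        ON s.account_id = e.account_id
       AND s.start_date < e.start_date
       AND s.end_date >= DATEADD(day, -p.allowed_gap, e.start_date)
    WHERE e.lvl < 5   -- stop early so the diagnostic always returns
)
SELECT lvl,
       COUNT(*) AS rows_emitted
FROM active
GROUP BY lvl
ORDER BY lvl;
```

If `rows_emitted` keeps growing from one level to the next while the table holds far fewer subscriptions, repeated expansion of duplicate rows is a likely culprit.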
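- Using LAG, we flag each row that starts a new island, i.e. whose start_date is more than the allowed gap after the previous row's end_date
- Using a running COUNT of those flags, we give each consecutive set of rows an ID
- We group by that ID, take the minimum start_date, and filter out the groups that do not qualify

DECLARE @allowed_gap int = 30,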
@calc_date datetime = cast('2021-09-30' as date);
WITH PrevValues AS (
SELECT *,
IsStart = CASE WHEN ISNULL(LAG(end_date) OVER (PARTITION BY account_id
ORDER BY start_date), '2099-01-01') < DATEADD(day, -@allowed_gap, start_date)
THEN 1 END
FROM subscription
),
Groups AS (
SELECT *,
GroupId = COUNT(IsStart) OVER (PARTITION BY account_id
ORDER BY start_date ROWS UNBOUNDED PRECEDING)
FROM PrevValues
),
ByGroup AS (
SELECT
account_id,
GroupId,
start_date = MIN(start_date)
FROM Groups
GROUP BY account_id, GroupId
HAVING COUNT(CASE WHEN start_date <= @calc_date
and (end_date > @calc_date or end_date is null) THEN 1 END) > 0
)
SELECT
account_id,
start_date = MIN(start_date)
FROM ByGroup
GROUP BY account_id;
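To see how the first two steps label individual rows, the intermediate columns can be selected directly. This is only an inspection sketch: it reuses the answer's expressions, and the `prev_end_date` column is a hypothetical addition for readability.

```sql
DECLARE @allowed_gap int = 30;

WITH PrevValues AS (
    SELECT id, account_id, start_date, end_date,
        -- illustrative extra column: the end_date the current row is compared against
        prev_end_date = LAG(end_date) OVER (PARTITION BY account_id
            ORDER BY start_date),
        -- 1 when the gap to the previous row exceeds @allowed_gap, otherwise NULL
        IsStart = CASE WHEN ISNULL(LAG(end_date) OVER (PARTITION BY account_id
            ORDER BY start_date), '2099-01-01') < DATEADD(day, -@allowed_gap, start_date)
            THEN 1 END
    FROM subscription
)
SELECT id, account_id, start_date, end_date, prev_end_date, IsStart,
    -- running count of the flags: rows in the same island share a GroupId
    GroupId = COUNT(IsStart) OVER (PARTITION BY account_id
        ORDER BY start_date ROWS UNBOUNDED PRECEDING)
FROM PrevValues
ORDER BY account_id, start_date;
```

On the sample rows, only the subscription starting 01/03/2020 should be flagged, because the previous one ended on 01/01/2020, more than 30 days earlier. Note that LAG compares each row only with the immediately preceding row by start_date, so an older subscription that ends later than that preceding row is not taken into account. With such data the result can differ from the recursive PostgreSQL query.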