Merge groups of consecutive rows in T-SQL and sum values from each group
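Update, October 8, 2019:
@Gordon Linoff: I tried to apply your solution, but I realized that it does not work as expected. I have added an example with the expected result here ( https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=1b486476d6aeab25997f25e66ee455e9 ), and I would be grateful if you could help me.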
--
I have a Transactions table with the following schema:
CREATE TABLE Transactions (Id int IDENTITY, SessionId int, TransactionType varchar(50), DateTimeEnd datetime, DateStart datetime, Rank int);
Here are some sample rows:
INSERT INTO Transactions (Id, SessionId, TransactionType, DateTimeEnd, DateStart, Rank)
VALUES
(1, 1, 'Deposit', '2017-01-20T11:16:33Z', '2017-01-20T11:16:33Z', 600),
(2, 1, 'Withdrawal', '2017-01-21T11:16:33Z', '2017-01-20T11:16:33Z', 100),
(3, 2, 'Deposit', '2017-02-23T11:16:33Z', '2017-02-23T11:16:33Z', 500),
(4, 1, 'Withdrawal', '2017-01-24T11:16:33Z', '2017-01-21T11:16:33Z', 150),
(5, 1, 'Withdrawal', '2017-01-26T11:16:33Z', '2017-01-24T11:16:33Z', 150),
(6, 2, 'Withdrawal', '2017-02-27T11:16:33Z', '2017-02-23T11:16:33Z', 200),
(7, 1, 'Withdrawal', '2017-01-28T11:16:33Z', '2017-01-26T11:16:33Z', 10),
(8, 1, 'Withdrawal', '2017-01-30T11:16:33Z', '2017-01-28T11:16:33Z', 10),
(9, 1, 'Withdrawal', '2017-01-31T11:16:33Z', '2017-01-30T11:16:33Z', 10);
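What I want is a T-SQL query that groups by SessionId and TransactionType and, within each group, merges consecutive rows so that only the row with the smallest DateTimeEnd is kept. In addition, the Rank value of the kept row must be the sum of the Rank values of all rows in its group. The query needs to run on MS SQL Server in Microsoft Azure SQL Data Warehouse.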
Expected result:
| Id | SessionId | Transaction | DateTimeEnd | DateStart | Rank |
|----------|------------------|-------------|--------------------|--------------------|---------|
| 1 | 1 | Deposit|2017-01-20T11:16:33Z|2017-01-20T11:16:33Z| 600 |
| 2 | 1 | Withdrawal|2017-01-21T11:16:33Z|2017-01-20T11:16:33Z| 100 |
| 4 | 1 | Withdrawal|2017-01-24T11:16:33Z|2017-01-21T11:16:33Z| 300 |
| 7 | 1 | Withdrawal|2017-01-28T11:16:33Z|2017-01-26T11:16:33Z| 30 |
| 3 | 2 | Deposit|2017-02-23T11:16:33Z|2017-02-23T11:16:33Z| 500 |
| 6 | 2 | Withdrawal|2017-02-27T11:16:33Z|2017-02-23T11:16:33Z| 200 |
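To make the intended grouping concrete, here is a minimal Python sketch of one reading of "consecutive" that reproduces the table above: rows are treated as consecutive when they are adjacent in global Id order and share the same SessionId and TransactionType. That reading is an assumption inferred from the sample data, and the function name `merge_consecutive` is hypothetical.

```python
from itertools import groupby

# Sample rows from the question: (Id, SessionId, TransactionType, DateTimeEnd, Rank)
rows = [
    (1, 1, "Deposit",    "2017-01-20T11:16:33Z", 600),
    (2, 1, "Withdrawal", "2017-01-21T11:16:33Z", 100),
    (3, 2, "Deposit",    "2017-02-23T11:16:33Z", 500),
    (4, 1, "Withdrawal", "2017-01-24T11:16:33Z", 150),
    (5, 1, "Withdrawal", "2017-01-26T11:16:33Z", 150),
    (6, 2, "Withdrawal", "2017-02-27T11:16:33Z", 200),
    (7, 1, "Withdrawal", "2017-01-28T11:16:33Z", 10),
    (8, 1, "Withdrawal", "2017-01-30T11:16:33Z", 10),
    (9, 1, "Withdrawal", "2017-01-31T11:16:33Z", 10),
]

def merge_consecutive(rows):
    """Merge runs of rows that are adjacent in Id order (assumption) and
    share (SessionId, TransactionType); keep the row with the smallest
    DateTimeEnd and replace its Rank with the sum over the run."""
    ordered = sorted(rows, key=lambda r: r[0])
    merged = []
    # groupby only groups adjacent items, which is exactly the "island" idea
    for _, run in groupby(ordered, key=lambda r: (r[1], r[2])):
        run = list(run)
        first = min(run, key=lambda r: r[3])  # ISO timestamps sort as strings
        merged.append(first[:4] + (sum(r[4] for r in run),))
    return merged

result = sorted(merge_consecutive(rows), key=lambda r: (r[1], r[0]))
for row in result:
    print(row)
```

Under this reading, Id 3 separates Id 2 from Id 4, and Id 6 separates Id 5 from Id 7, which is why the Withdrawal rows of session 1 fall into three groups.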
I have tried many approaches, but I could not get it to work.
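As GMB points out, this is a gaps-and-islands problem. Because you want to keep the first row, I would suggest the lag() approach rather than a difference of row numbers: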
SELECT SessionId, TransactionType, DateTimeEnd, DateStart, sumRank
FROM (SELECT t.*,
             SUM(Rank) OVER (PARTITION BY SessionId, TransactionType, grp) as sumRank
      FROM (SELECT t.*,
                   SUM(CASE WHEN prev_st_id = prev_id THEN 0 ELSE 1 END) OVER (ORDER BY id) as grp
            FROM (SELECT t.*,
                         LAG(id) OVER (PARTITION BY SessionId, TransactionType ORDER BY id) as prev_st_id,
                         LAG(id) OVER (PARTITION BY SessionId ORDER BY id) as prev_id
                  FROM Transactions t
                 ) t
           ) t
     ) t
WHERE prev_st_id <> prev_id OR prev_st_id IS NULL;
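What is this doing?
The innermost subquery computes the previous id both within the SessionId and within the SessionId/TransactionType pair; when the two match, the rows are adjacent.
It orders by id because that looks more stable than the date/time columns (one of them contains duplicate date/time values).
The middle subquery takes a cumulative sum of the group starts to compute grp.
The outer subquery calculates the value for the entire group. Here is a db<>fiddle.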
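For stepping through the lag() logic by hand, the following Python sketch mirrors the three nested subqueries on the sample rows from the question. It is an illustrative re-implementation, not the SQL itself; the function name `lag_islands` and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict

# Sample rows from the question: (Id, SessionId, TransactionType, Rank)
rows = [
    (1, 1, "Deposit", 600), (2, 1, "Withdrawal", 100), (3, 2, "Deposit", 500),
    (4, 1, "Withdrawal", 150), (5, 1, "Withdrawal", 150), (6, 2, "Withdrawal", 200),
    (7, 1, "Withdrawal", 10), (8, 1, "Withdrawal", 10), (9, 1, "Withdrawal", 10),
]

def lag_islands(rows):
    last_in_session = {}       # SessionId -> previous Id (prev_id)
    last_in_session_type = {}  # (SessionId, TransactionType) -> previous Id (prev_st_id)
    grp = 0
    annotated = []
    for id_, session, ttype, rank in sorted(rows, key=lambda r: r[0]):
        prev_id = last_in_session.get(session)
        prev_st_id = last_in_session_type.get((session, ttype))
        # Mirrors CASE WHEN prev_st_id = prev_id THEN 0 ELSE 1 END;
        # a NULL on either side falls through to the ELSE branch.
        is_start = prev_st_id is None or prev_st_id != prev_id
        grp += 1 if is_start else 0  # running SUM(...) OVER (ORDER BY id)
        annotated.append((id_, session, ttype, rank, grp, is_start))
        last_in_session[session] = id_
        last_in_session_type[(session, ttype)] = id_

    # SUM(Rank) OVER (PARTITION BY SessionId, TransactionType, grp)
    totals = defaultdict(int)
    for _, session, ttype, rank, g, _ in annotated:
        totals[(session, ttype, g)] += rank

    # WHERE prev_st_id <> prev_id OR prev_st_id IS NULL keeps only group starts
    return [(id_, session, ttype, totals[(session, ttype, g)])
            for id_, session, ttype, _, g, is_start in annotated if is_start]

kept = lag_islands(rows)
for row in kept:
    print(row)
```

On the sample rows this sketch keeps ids 1, 2, 3 and 6, because adjacency is measured inside each SessionId, while grp is accumulated in global id order.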
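This is a variant of the gaps-and-islands problem.
I would approach it as follows:
First, identify and merge the groups of records. The following query gives you the minimum DateTimeEnd of each group, along with the sum of the Rank values: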
SELECT SessionId, TransactionType, SUM(Rank) SumRank, MIN(DateTimeEnd) MinDateTimeEnd
FROM (
    SELECT
        t.*,
        ROW_NUMBER() OVER(ORDER BY DateTimeEnd) rn1,
        ROW_NUMBER() OVER(PARTITION BY SessionId, TransactionType ORDER BY DateTimeEnd) rn2
    FROM Transactions t
) x
GROUP BY SessionId, TransactionType, rn1 - rn2
Returns:
| SessionId | TransactionType | SumRank | MinDateTimeEnd      |
|----------:|:----------------|--------:|:--------------------|
|         1 | Deposit         |     600 | 20/01/2017 11:16:33 |
|         1 | Withdrawal      |     430 | 21/01/2017 11:16:33 |
|         2 | Deposit         |     500 | 23/02/2017 11:16:33 |
|         2 | Withdrawal      |     200 | 27/02/2017 11:16:33 |
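The row-number difference can be hard to picture, so here is a small Python sketch of the same idea on the sample rows from the question. It numbers the rows once globally (rn1) and once per SessionId/TransactionType (rn2), both by DateTimeEnd, and groups on rn1 - rn2. The function name `row_number_islands` is hypothetical and the code only illustrates the technique.

```python
from collections import defaultdict

# Sample rows from the question: (Id, SessionId, TransactionType, DateTimeEnd, Rank)
rows = [
    (1, 1, "Deposit",    "2017-01-20T11:16:33Z", 600),
    (2, 1, "Withdrawal", "2017-01-21T11:16:33Z", 100),
    (3, 2, "Deposit",    "2017-02-23T11:16:33Z", 500),
    (4, 1, "Withdrawal", "2017-01-24T11:16:33Z", 150),
    (5, 1, "Withdrawal", "2017-01-26T11:16:33Z", 150),
    (6, 2, "Withdrawal", "2017-02-27T11:16:33Z", 200),
    (7, 1, "Withdrawal", "2017-01-28T11:16:33Z", 10),
    (8, 1, "Withdrawal", "2017-01-30T11:16:33Z", 10),
    (9, 1, "Withdrawal", "2017-01-31T11:16:33Z", 10),
]

def row_number_islands(rows):
    ordered = sorted(rows, key=lambda r: r[3])  # ORDER BY DateTimeEnd
    per_partition = defaultdict(int)            # counters behind rn2
    islands = defaultdict(list)
    for rn1, row in enumerate(ordered, start=1):
        key = (row[1], row[2])                  # (SessionId, TransactionType)
        per_partition[key] += 1
        rn2 = per_partition[key]
        # rn1 - rn2 stays constant while rows of one partition are adjacent
        islands[key + (rn1 - rn2,)].append(row)
    # GROUP BY SessionId, TransactionType, rn1 - rn2
    return sorted(
        (session, ttype, sum(r[4] for r in grp), min(r[3] for r in grp))
        for (session, ttype, _), grp in islands.items()
    )

summary = row_number_islands(rows)
for line in summary:
    print(line)
```

Because every Withdrawal of session 1 ends before the first row of session 2, all six of them share the same difference and collapse into one island with a total of 430.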
Then, join the result of the above query with the original table to pull out the rest of the columns:
SELECT t.id, t.SessionId, t.TransactionType, t.DateTimeEnd, t.DateStart, x.SumRank
FROM Transactions t
INNER JOIN (
    SELECT SessionId, TransactionType, SUM(Rank) SumRank, MIN(DateTimeEnd) MinDateTimeEnd
    FROM (
        SELECT
            t.*,
            ROW_NUMBER() OVER(ORDER BY DateTimeEnd) rn1,
            ROW_NUMBER() OVER(PARTITION BY SessionId, TransactionType ORDER BY DateTimeEnd) rn2
        FROM Transactions t
    ) x
    GROUP BY SessionId, TransactionType, rn1 - rn2
) x
    ON x.SessionId = t.SessionId
    AND x.TransactionType = t.TransactionType
    AND x.MinDateTimeEnd = t.DateTimeEnd
Yields:
| id | SessionId | TransactionType | DateTimeEnd         | DateStart           | SumRank |
|---:|----------:|:----------------|:--------------------|:--------------------|--------:|
|  1 |         1 | Deposit         | 20/01/2017 11:16:33 | 20/01/2017 11:16:33 |     600 |
|  2 |         1 | Withdrawal      | 21/01/2017 11:16:33 | 20/01/2017 11:16:33 |     430 |
|  3 |         2 | Deposit         | 23/02/2017 11:16:33 | 23/02/2017 11:16:33 |     500 |
|  6 |         2 | Withdrawal      | 27/02/2017 11:16:33 | 23/02/2017 11:16:33 |     200 |
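Note: as mentioned in the comments, I think there is a problem with the expected result you showed. The rows with id 4 and 7 should not appear in the output, because the row with id 2 has the same SessionId and TransactionType and an earlier DateTimeEnd.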