简体   繁体   中英

Summarize two columns that have a many-to-many relationship in an SQL database

I have the following database tables Transactions and Payments . The data is structured as such that both Transactions and Payments have a many-to-many relationship, which are joined together in a query as such:

SELECT Transactions.Id, Payments.Id, Transactions.Amount
FROM Transactions 
LEFT JOIN Payments ON Transactions.Id = Payments.TransactionId 

| Transactions.Id | Payments.PaymentId | Transactions.Amount |
| ----------------| -------------------| --------------------|
| 3492            | 123456             | 123                 |
| 3492            | 123457             | 123                 |
| 3493            | 123458             | 300                 |
| 3494            | 123459             | 10                  |
| 3495            | 123459             | 25                  |

I want to be able to simplify the table into something like below (if this is at all even possible in SQL)

| Transactions.Id | Payments.PaymentId | Transactions.Amount |
| ----------------| -------------------| --------------------|
| 3492            | 123456, 123457     | 123                 |
| 3493            | 123458             | 300                 |
| 3494, 3495      | 123459             | 35                  |

...or even without the third column (if it reduces complexity)

| Transactions.Id | Payments.PaymentId |
| ----------------| -------------------|
| 3492            | 123456, 123457     |
| 3493            | 123458             |
| 3494, 3495      | 123459             |

What I have so far:

SELECT
Transactions.Id, 
   STUFF((SELECT '; ' + CAST(Payments.Id as varchar)
    FROM Payments 
    WHERE Payments.TransactionId = Transactions.Id
    FOR XML PATH('')), 1, 1, '') [PaymentIds]
FROM Transactions

| Transactions.Id | PaymentIds         |
| ----------------| -------------------|
| 3492            | 123456, 123457     |
| 3493            | 123458             |
| 3494            | 123459             |
| 3495            | 123459             |

How do I add Transactions.Id into a comma separated list as well? I am using SQL Server 12.0.2000.8

SQLFiddle here

Basically, you have to take two passes in order to aggregate the payments per transaction and then the transaction per payment aggregate. This assumes (based on the sample data and desired output) that when a payment applies to multiple transactions, you want the sum, but when a transaction has multiple payments, you want the max (or they'll always be the same in which case max works just fine?):

;WITH FirstPass AS 
(
  SELECT t.Id, pId = STUFF(
    (SELECT ', ' + CONVERT(varchar(11), p.Id)
      FROM dbo.Payments AS p
      WHERE p.TransactionId = t.Id
      ORDER BY p.Id
      FOR XML PATH(''), TYPE).value(N'./text()[1]', 
        N'varchar(max)'), 1, 2, ''),
    Amount = MAX(t.Amount)
  FROM dbo.Transactions AS t
  GROUP BY t.Id
)
SELECT [Transactions.Id] = STUFF(
  (SELECT ', ' + CONVERT(varchar(11), fp.Id)
    FROM FirstPass AS fp
    WHERE fp.pId = FirstPass.pId
    ORDER BY fp.pId
    FOR XML PATH(''), TYPE).value(N'./text()[1]', 
        N'varchar(max)'), 1, 2, ''),
  [Payments.PaymentId] = FirstPass.pId,
  Amount = SUM(FirstPass.Amount)
FROM FirstPass 
GROUP BY FirstPass.pId;

Output:

Transactions.Id Payments.PaymentId Amount
3492 123456, 123457 123.00
3493 123458 300.00
3494, 3495 123459 35.00

On SQL Server 2017+ (or Azure SQL Database) this becomes much simpler, though still a little convoluted for this specific use case. This doesn't help you right now, but could help other readers today, future readers, or even future you:

;WITH FirstPass AS 
(
  SELECT t.Id, ca.pId, Amount = MAX(t.Amount)
  FROM dbo.Transactions AS t
  CROSS APPLY
  (
    SELECT pId = STRING_AGG(p.Id, ', ')
      FROM dbo.Payments AS p
      WHERE p.TransactionId = t.Id
  ) AS ca GROUP BY t.Id, ca.pId
)
SELECT [Transactions.Id] = STRING_AGG(Id, ', '),
  [Payments.PaymentId] = pId,
  Amount = SUM(Amount)
FROM FirstPass 
GROUP BY pId;

Same output:

The following code generates the output you want

WITH cte1 AS(
   SELECT 
       Transactions.Id AS TransactionID, 
       Payments.Id AS PaymentID,
       max(Transactions.amount) Amount
   FROM Transactions 
   LEFT JOIN Payments ON Transactions.Id = Payments.TransactionId 
   GROUP BY Transactions.Id,Payments.Id),
   
   
cte2 AS(
  SELECT

       STUFF((SELECT ',' + cast(t2.TransactionID as varchar(100)) 
       FROM cte1 t2
       WHERE t2.PaymentID = t1.PaymentID
       FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'')TransactionID,
     
       STUFF((SELECT ',' + cast(t2.PaymentID as varchar(100)) 
       FROM cte1 t2
       WHERE t2.TransactionID = t1.TransactionID
       FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'')PaymentID,
     
       (SELECT sum(Amount)
       FROM cte1 t2 
       WHERE t2.PaymentID = t1.PaymentID
       GROUP BY PaymentID)Amount
     
    
     
   FROM cte1 t1
)

SELECT TransactionID,PaymentID,Amount
FROM cte2
GROUP BY TransactionID,PaymentID,Amount

result

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM