Duplicates from an SQL Query

I have a dataset I retrieve from multiple joins. I have used SELECT DISTINCT in my statements but I still see duplicates in the result set. Here is the code:

SELECT DISTINCT Account
, PayoffAmtDOL as 'Payoff Amount DOL'
, PayoffAmtLOG as 'Payoff Amount LOG'
, PayoffAmountLive as 'Payoff Amount Live'
, [Difference]
, PrincipalBalance as 'Principal Balance'
, CreationDate as 'Date Entered System'
, CACSState as 'CACS State at Entry'
, PaymentsMade AS 'Payments Made'
, TotalPaymentAmount as 'Total Payment Amount'
, 'Liquidation Percentage' = CASE WHEN PayoffAmountLive = 0 THEN 1
                            WHEN ISNULL([Difference],0) = ISNULL(PayoffAmtDOL, 0) THEN 1
                            WHEN ISNULL([Difference],0) < 0 AND ISNULL(PayoffAmtDOL, 0) > 0 THEN 0
                            WHEN ISNULL([Difference],0) > 0 AND ISNULL(PayoffAmtDOL, 0) < 0 THEN 1
                            WHEN ISNULL([Difference],0) > ISNULL(PayoffAmtDOL, 0) THEN 1
                            WHEN [Difference] > 0 AND ISNULL(PayoffAmtDOL, 0) = 0 THEN 1
                            WHEN ISNULL(PayoffAmtDOL, 0) = 0 THEN 0
                            ELSE ISNULL([Difference],0)/ISNULL(PayoffAmtDOL, 0) END
          , Cnt = 1
FROM 
(
SELECT DISTINCT a.Account,
       c.PayoffAmtDOL,
       c.PayoffAmtLOG,
       (ISNULL(c.PayoffAmtCACS, cacs.payoff_amt)) as 'PayoffAmountLive',
       (ISNULL(c.PayoffAmtDOL, 0) - (ISNULL(c.PayoffAmtCACS , ISNULL(cacs.payoff_amt, 0)))) as 'Difference',
       c.PrincipalBalance,
       c.CreationDate,
       c.CACSState,
       (SELECT COUNT(PaymentID)
        FROM tblATLPaymentInfo p
        WHERE p.AccountID = a.AccountID
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) >=  '1/1/2014'
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) <= '3/27/2014'
        ) as 'PaymentsMade',
         (SELECT SUM(PaymentAmount)
        FROM tblATLPaymentInfo p
        WHERE p.AccountID = a.AccountID
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) >= '1/1/2014'
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) <= '3/27/2014'
        ) as 'TotalPaymentAmount'

FROM tblATLAcctInfo a
RIGHT JOIN tblATLClaimInfo c
    ON c.AccountID = a.AccountID
LEFT JOIN SCFLOKYDCMSQL03.CACS_DM.dbo.Cacs_Info cacs
    ON cacs.Account = a.Account
WHERE CONVERT(DATETIME, CONVERT(VARCHAR(10), c.CreationDate, 101)) >= '1/1/2014'
    AND CONVERT(DATETIME, CONVERT(VARCHAR(10), c.CreationDate, 101)) <=  '3/27/2014'
    AND c.ClaimTypeID = (SELECT  DISTINCT ClaimTypeID FROM tblATLClaimType WHERE ClaimType = 'N02 - Claims')
) a
ORDER BY Account

Here is an example of the duplicate rows:

AccountID   DateEntered
123     01/19/2014
123     01/21/2014
345     02/1/2014
345     02/10/2014

The only difference between the rows appears to be the date entered. Maybe numbering the rows with ROW_NUMBER() and then discarding the row with the later date could be a solution.

DISTINCT should not return duplicate rows; there should be at least one column that differs between any two rows, no? With character data, one can sometimes be fooled by non-visible differences, such as stray whitespace or non-printing characters. Not sure if that is the case here, though.
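To make the point concrete, here is a minimal sketch of how DISTINCT compares whole rows. The table variable @Rows and its sample values are made up for illustration (they are not the asker's tables), and the syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data: two exact duplicates plus one row with a different date.
DECLARE @Rows TABLE (AccountID INT, DateEntered DATE);

INSERT INTO @Rows (AccountID, DateEntered) VALUES
    (123, '20140119'),
    (123, '20140119'),  -- exact duplicate of the row above
    (123, '20140121');  -- same account, different date

-- Returns two rows: the exact duplicate collapses, the differing date survives.
SELECT DISTINCT AccountID, DateEntered
FROM @Rows;

-- Returns one row, because only AccountID takes part in the comparison.
SELECT DISTINCT AccountID
FROM @Rows;
```

So an account appears twice as soon as any selected column, here the date, differs between its rows.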

Can you give an example of the duplicate rows?

OK, I see your edit. You have to select which of the dates to display. Try this to get the earliest date per AccountID:

 SELECT AccountID, MIN(DateEntered) AS DateEntered
 FROM ....
 GROUP BY AccountID
 ORDER BY AccountID

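You can add more columns to the SELECT, but each one must either be listed in the GROUP BY or be wrapped in an aggregate. As long as the added grouping columns have a single value per AccountID, you will not get more rows.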

If you want, you can add COUNT(*) to the select to get the number of rows grouped.
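A small self-contained sketch of the grouping approach, using the example rows from the question. The table variable @Claims is a stand-in for the asker's joined result set and the alias RowsGrouped is a made-up name; the syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical stand-in for the joined result set, loaded with the rows from the question.
DECLARE @Claims TABLE (AccountID INT, DateEntered DATE);

INSERT INTO @Claims (AccountID, DateEntered) VALUES
    (123, '20140119'),
    (123, '20140121'),
    (345, '20140201'),
    (345, '20140210');

-- One row per account: the earliest date and the number of rows that were grouped.
SELECT AccountID,
       MIN(DateEntered) AS DateEntered,
       COUNT(*)         AS RowsGrouped
FROM @Claims
GROUP BY AccountID
ORDER BY AccountID;
-- Returns (123, 2014-01-19, 2) and (345, 2014-02-01, 2).
```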

DISTINCT only removes rows that are exact duplicates across every selected column. Here DateEntered differs between the rows for each ID, so both rows are kept. If you want the latest date, use MAX(DateEntered).
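If the other columns of the latest row are needed as well, the ROW_NUMBER() idea from the question is one possible alternative to MAX. The sketch below is only an illustration: @Claims, the CACSState values, and the alias rn are assumptions, not the asker's actual data, and the syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data; the CACSState values are invented for illustration.
DECLARE @Claims TABLE (AccountID INT, DateEntered DATE, CACSState VARCHAR(10));

INSERT INTO @Claims (AccountID, DateEntered, CACSState) VALUES
    (123, '20140119', 'A'),
    (123, '20140121', 'B'),
    (345, '20140201', 'A'),
    (345, '20140210', 'C');

-- Number the rows within each account, newest date first.
WITH Ranked AS
(
    SELECT AccountID,
           DateEntered,
           CACSState,
           ROW_NUMBER() OVER (PARTITION BY AccountID
                              ORDER BY DateEntered DESC) AS rn
    FROM @Claims
)
-- Keep only the newest row per account, with all of its columns.
SELECT AccountID, DateEntered, CACSState
FROM Ranked
WHERE rn = 1
ORDER BY AccountID;
-- Returns (123, 2014-01-21, B) and (345, 2014-02-10, C).
```

Changing the ORDER BY inside the OVER clause to ascending would keep the earliest row instead.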
