简体   繁体   中英

SQL: Select top 1 of each group returning multiple records for each group

I have a table which looks like this:

| Id | InvestorFundId | Name | AccountKey | AsOfDate | AddedOn  |
| 1  | 11111          | Name1| Key1       | 9/5/2018 | 8/5/2018 |
| 2  | 11111          | Name2| Key1       | 9/3/2018 | 8/5/2018 |
| 3  | 22222          | Name3| Key2       | 9/2/2018 | 8/5/2018 |
| 4  | 33333          | Name4| Key3       | 9/2/2018 | 8/5/2018 |
| 5  | 33333          | Name5| Key3       | 9/4/2018 | 8/5/2018 |

I need to be able to return the most recent InvestorFundId , Name, and AccountKey for each group, based on ordered AsOfDate and AddedOn descending.

Expected result should look like this:

| InvestorFundId | Name | AccountKey |
| 11111          | Name1| Key1       | 
| 22222          | Name3| Key2       |
| 33333          | Name5| Key3       | 

I've looked at some posts but I can't correctly return the rows, here is what I have so far:

SELECT Name, AccountKey, H.InvestorFundId FROM
[Investor.Fund.History] H
CROSS APPLY(SELECT TOP 1 InvestorFundId 
FROM [Investor.Fund.History] 
WHERE  DataStatusId = 1 AND AsOfYearMonth <= 201806
ORDER BY AsOfDate DESC, AddedOn DESC) HS
ORDER BY H.InvestorFundId

But this is returning me multiple records for each group instead of the most recent.

I just want to add that the current implementation uses ROW_NUMBER but it is too slow, so I am investigating into other solutions.

Thank you

NOTE: I know that I have two extra columns in my where, I just opted to exclude those from the diagram but you should be able to get the gist

You can use a CTE to get the data you need.

USE tempdb;
GO

DECLARE @table TABLE (Id INT IDENTITY(1, 1), InvestorFundId INT, Name VARCHAR(50), AccountKey VARCHAR(50), AsOfDate DATE, AddedOn DATE);
INSERT INTO @table VALUES (11111, 'Name1', 'Key1', '9/5/2018', '8/5/2018');
INSERT INTO @table VALUES (11111, 'Name2', 'Key1', '9/3/2018', '8/5/2018');
INSERT INTO @table VALUES (22222, 'Name3', 'Key2', '9/2/2018', '8/5/2018');
INSERT INTO @table VALUES (33333, 'Name4', 'Key3', '9/2/2018', '8/5/2018');
INSERT INTO @table VALUES (33333, 'Name5', 'Key3', '9/4/2018', '8/5/2018');

;WITH CTE AS
(
    SELECT InvestorFundId, Name, AccountKey, ROW_NUMBER() OVER (PARTITION BY InvestorFundID ORDER BY AsOfDate DESC) AS RowId FROM @table
)
SELECT InvestorFundId, Name, AccountKey
    FROM CTE
    WHERE RowId = 1;

Here is a working SQLFiddle

Hope it helps.

You can use WITH TIES and simply apply ROW_NUMBER in the ORDER BY

Select top 1 with ties *
From History
Where DataStatusId = 1 and AsOfYearMonth = 201806
Order by 
Row_Number() over (partition by InvestorFundID order by AsOfDate desc)

Or with a sub query

Select  *
    From (select * , RN= Row_Number() over (partition by InvestorFundID order by AsOfDate desc)
             From History
             Where DataStatusId = 1 and AsOfYearMonth = 201806) x
Where x.RN = 1

If you find this slower, then we'd need to see the execution plan to determine WHY it's slow. A non-clustered index on InvestorFundId, AsOfDate desc would make this super fast.

Create nonclustered index indexName on
History (InvestorFundID, AsOfDate desc)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM