简体   繁体   中英

How to group records and retrieve only first group with top N records

I have the following record set

ID          BatchID     ClientName           CreatedDateTime
----------- -------------- --------------- -----------------------
1           NULL           B             2018-02-16 19:07:46.320
2           NULL           B             2018-02-16 19:07:46.320
3           NULL           B             2018-02-16 19:07:46.597
4           NULL           B             2018-02-16 19:07:46.597
5           NULL           B             2018-02-16 19:10:10.260
6           NULL           B             2018-02-16 19:10:10.260
7           NULL           B             2018-02-16 19:21:34.303
8           NULL           B             2018-02-16 19:21:34.303
9           NULL           B             2018-02-16 19:21:44.780
10          NULL           B             2018-02-16 19:21:44.780
11          NULL           A             2018-02-16 19:24:35.623
12          NULL           A             2018-02-16 19:24:35.623
13          NULL           A             2018-02-16 19:24:42.867
14          NULL           A             2018-02-16 19:24:42.867

I am using LINQ to SQL in EF Core.

I want to filter the records where BatchID is NULL and then order the filtered records by CreatedDateTime and then group them by ClientName and then take top 5 records from the first Group.

Based on the given record set above it should return records with Ids 1,2,3,4,5 for ClientName B

So here is my query

 var result = await _DBContext.BatchRequests
                .Where(x => x.BatchID.HasValue == false)
                .OrderBy(x => x.CreatedDateTime)
                .GroupBy(x => x.ClientName)
                .FirstAsync();

ISSUE
1> The query returns Client A
2> How do i Take only top 5 records

Update 1

Sql Profiler show the following, it doesnt even group in SQL

SELECT [x].[ID], [x].[BatchID], [x].[ClientName], [x].[CreatedDateTime]
FROM [BatchRequests] AS [x]
WHERE CASE
    WHEN [x].[BatchID] IS NULL
    THEN CAST(1 AS BIT) ELSE CAST(0 AS BIT)
END <> 0
ORDER BY [x].[ClientName]

First, usually OrderBy has no effect (is ignored) if followed by GroupBy in Queryable implementations which translate LINQ queries to SQL.

Second, EF Core currently does not translate GroupBy queries to SQL, but processes them in memory (so called client evaluation ), which makes them highly inefficient. With that taken into account, you'd better split the work on two queries - one to take the ClientName of the first group, and second to take the desired result:

var baseQuery = _DBContext.BatchRequests
    .Where(x => x.BatchId == null)
    .OrderBy(x => x.CreatedDateTime);

var clientName = await baseQuery
    .Select(x => x.ClientName)
    .FirstOrDefaultAsync();

var result = await baseQuery
    .Where(x => x.ClientName == clientName)
    .Take(5)
    .ToListAsync();

Actualy you can combine the two queries, but I'm not sure whether it will be more efficient (could be worse):

var baseQuery = _DBContext.BatchRequests
    .Where(x => x.BatchId == null)
    .OrderBy(x => x.CreatedDateTime);

var result = await baseQuery
    .Where(x => x.ClientName == baseQuery.Select(y => y.ClientName).FirstOrDefault())
    .Take(5)
    .ToListAsync();

You will have to project the group result like something this:

result = await _DBContext.BatchRequests
            .Where(x => x.BatchID.HasValue == false)
            .OrderBy(x => x.CreatedDateTime)
            .ThenBy(x => x.ClientName)
            .GroupBy(x => x.ClientName)
            .Select( x => new { ClientName= x.ClientName,
                                 TopFive = x.Take(5)
                         })
            .FirstAsync();

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM