
Group by multiple columns as if it was one column

Is there a way in C# with LINQ to group by multiple columns as if they were a single column? Let's say I have an object that represents a message sent from one user to another:

public class Message
{
    public Int32 SenderId { get; set; }
    public Int32 RecipientId { get; set; }
    public String Text { get; set; }
    public DateTime SentAt { get; set; }
}

I get a set of these from a repository, and I have to order them by sent date and group them by the (SenderId, RecipientId) pair, because I want to get back only the very last message of each dialog between two users. The direction doesn't matter, so (SenderId == 1 && RecipientId == 2) should be treated the same as (SenderId == 2 && RecipientId == 1).

So far I got this:

Messages
    .GroupBy(m => new { m.SenderId, m.RecipientId })
    .Select(gm => gm.OrderByDescending(m => m.SentAt).FirstOrDefault());

The problem with this approach is that when there is a message from user 1 to user 2 and another one from 2 to 1, both are returned, because the key (1,2) isn't the same as the key (2,1).

Any suggestions would be great. Thank you!

UPD: I should've added that I'm using this expression on an IQueryable, so it's LINQ to Entities.

Even though your question makes clear that this is LINQ to Entities, not everyone seemed to understand that many in-memory solutions fail because they don't have a SQL equivalent. Here's a query that will translate into SQL for many different IQueryable implementations and query providers:

Messages
    .GroupBy(m => new
    {
        MinId = m.SenderId <= m.RecipientId ? m.SenderId : m.RecipientId,
        MaxId = m.SenderId > m.RecipientId ? m.SenderId : m.RecipientId
    })
    .Select(gm => gm.OrderByDescending(m => m.SentAt).FirstOrDefault());

Using Math.Min / Math.Max only works in Entity Framework Core 2 (and earlier EF Core versions), because those automatically switch to client-side evaluation for any part of the query that doesn't translate into SQL. This behavior was abandoned in EF Core 3, which applies client-side evaluation only in the final projection (Select), if necessary. So a GroupBy that uses Math.Min / Math.Max will fail in EF Core 3, which you're going to use sooner or later.

However, I have to warn you that the query above currently won't work in EF Core 3 either, because EF Core 3 doesn't fully support GroupBy yet. But I'm sure that's a matter of time.
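
The grouping logic itself can be checked in memory. The following is only a sketch with LINQ to Objects, so it says nothing about SQL translation; the sample data is made up, and value tuples stand in for the Message class from the question. It assumes a C# 9 or later project (top-level statements).

```csharp
using System;
using System.Linq;

// Made-up sample data; value tuples stand in for the Message class.
var messages = new[]
{
    (SenderId: 1, RecipientId: 2, Text: "hi",        SentAt: new DateTime(2020, 1, 1)),
    (SenderId: 2, RecipientId: 1, Text: "hello",     SentAt: new DateTime(2020, 1, 2)),
    (SenderId: 3, RecipientId: 1, Text: "ping",      SentAt: new DateTime(2020, 1, 3)),
    (SenderId: 1, RecipientId: 3, Text: "pong",      SentAt: new DateTime(2020, 1, 4)),
    (SenderId: 2, RecipientId: 3, Text: "unrelated", SentAt: new DateTime(2020, 1, 5)),
};

var lastPerDialog = messages
    // Order the two IDs inside the key so that (1,2) and (2,1) share a group.
    .GroupBy(m => new
    {
        MinId = m.SenderId <= m.RecipientId ? m.SenderId : m.RecipientId,
        MaxId = m.SenderId > m.RecipientId ? m.SenderId : m.RecipientId
    })
    // Keep only the most recent message of each dialog.
    .Select(g => g.OrderByDescending(m => m.SentAt).First())
    .OrderBy(m => m.SentAt)
    .ToList();

foreach (var m in lastPerDialog)
    Console.WriteLine($"{m.SenderId} -> {m.RecipientId}: {m.Text}");
// 2 -> 1: hello
// 1 -> 3: pong
// 2 -> 3: unrelated
```

Five messages collapse into three dialogs, and each dialog yields its latest message regardless of direction.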

GroupBy accepts a custom equality comparer. You could implement your specific matching logic there like so:

class MessageComparer : IEqualityComparer<Message>
{
    // Two messages match when they connect the same pair of users, in either direction.
    // You can make this as specific as your requirements demand.
    public bool Equals(Message x, Message y)
    {
        return (x.SenderId == y.SenderId && x.RecipientId == y.RecipientId)
            || (x.SenderId == y.RecipientId && x.RecipientId == y.SenderId);
    }

    // XOR is symmetric, so (1,2) and (2,1) produce the same hash code.
    public int GetHashCode(Message obj)
    {
        return obj.SenderId ^ obj.RecipientId;
    }
}

In your calling method, pass the custom comparer to GroupBy:

Messages
    .GroupBy(m => m, new MessageComparer())
    .Select(gm => gm.OrderByDescending(m => m.SentAt).FirstOrDefault());

UPD: using LINQ to Entities changes the solution quite a bit. You are right, custom functions and equality comparers are not supported by EF, because it needs to translate the query to SQL somehow (and even if you managed to do it, your query performance would likely take a nosedive due to the lack of functional indexes). I see a couple of avenues to explore:

  1. select all messages for both users with no grouping and apply the grouping in C#, where you've got your custom comparers and can do pretty much anything, like so: Messages.OrderByDescending(m => m.SentAt).ToList() /* DB call happens here */ .GroupBy(m => m, new MessageComparer()).Select(g => g.First()) - this, however, potentially fetches a lot of data from the DB, so you will need to consider filtering
  2. change your database schema to introduce a SequenceNumber column that is shared between sender and recipient, so you can use it in SQL with all the indexes you like
  3. write a stored proc that does some sort of trickery on your two columns and gives you the desired output (for example, you could sort the two integers and concatenate them to produce one value that you then rely on).
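
The "sort the two integers and combine them into one value" idea from the last item can be sketched in C# as follows. DialogKey is a hypothetical helper, not something from this thread or from EF, and a query provider would not translate it to SQL; in a stored proc the same arithmetic would have to be written in SQL. The sample data is made up, value tuples stand in for Message rows already loaded into memory, and a C# 9 or later project is assumed.

```csharp
using System;
using System.Linq;

// Hypothetical helper: packs the smaller ID into the high 32 bits and the
// larger ID into the low 32 bits, so both directions yield the same key.
static long DialogKey(int a, int b)
{
    int min = Math.Min(a, b);
    int max = Math.Max(a, b);
    return ((long)min << 32) | (uint)max;
}

// Made-up sample data; value tuples stand in for Message rows.
var messages = new[]
{
    (SenderId: 1, RecipientId: 2, SentAt: new DateTime(2020, 1, 1)),
    (SenderId: 2, RecipientId: 1, SentAt: new DateTime(2020, 1, 2)),
    (SenderId: 1, RecipientId: 3, SentAt: new DateTime(2020, 1, 3)),
};

// Group on the single combined value and keep the latest timestamp per dialog.
var latestByDialog = messages
    .GroupBy(m => DialogKey(m.SenderId, m.RecipientId))
    .ToDictionary(g => g.Key, g => g.Max(m => m.SentAt));

Console.WriteLine(DialogKey(1, 2) == DialogKey(2, 1)); // True
Console.WriteLine(latestByDialog.Count);               // 2
```

Packing both IDs into one long keeps the key collision-free for any pair of 32-bit IDs, which plain string concatenation without a separator would not guarantee.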

Since both IDs are integers, you can order them in your GroupBy key:

Messages
    .GroupBy(m => new { MinId = Math.Min(m.SenderId, m.RecipientId), MaxId = Math.Max(m.SenderId, m.RecipientId) })
    .Select(gm => gm.OrderByDescending(m => m.SentAt).FirstOrDefault());
