
Sort list based on group count

I'd like to sort a List by the element counts of its IGroupings.

And that's it, the list should ideally be the same. I would compromise with a new list, but then the elements should be the very same original objects, not copies (however shallow) and definitely not anonymous objects.

Specifically: We have an entity with many properties, and a list of objects of this type. We'd like to (1) group the objects by certain properties (name, address, ...) then (2) count the number of elements in each group. Finally we'd like to (3) reorder the list based on these counts by placing the elements that are part of the larger groups first.

Note: Our main issue is that we can't seem to find a way to keep a reference to the original objects in the elements of the groups. Indeed, all we can select in the Linq query is the grouping key (or properties of the key), and nothing else is exposed by IGrouping. We can't figure out how to associate a group element with an element of the list either, short of looking at the data (and even then, we'd need the primary key, which we can't add to the grouping key or it would defeat the purpose of the grouping to begin with).

var mySortedList = myList.GroupBy(x => x.Name).OrderByDescending(g => g.Count())
                       .SelectMany(x => x).ToList();

Almost no operation in .NET clones an object. Neither deeply nor shallowly. LINQ also does not clone the elements it processes. Therefore, a simple LINQ query will work:

var oldList = ...;
var newList = (from x in oldList
               group x by something into g
               orderby g.Count() descending // larger groups first, as the question asks
               from x in g //flatten the groups
               select x).ToList();

This code copies references to the original objects. If you believe otherwise, you are probably misinterpreting what you are seeing.
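The claim above can be checked directly with ReferenceEquals. The following is only a minimal sketch: the Person class, its properties, and the SortByGroupSize helper are hypothetical names invented for this illustration, not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, invented for this illustration.
public sealed class Person
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class GroupSortDemo
{
    // Groups by Name, puts the largest groups first, then flattens the groups.
    public static List<Person> SortByGroupSize(IEnumerable<Person> source)
    {
        return (from p in source
                group p by p.Name into g
                orderby g.Count() descending
                from p in g // flatten the groups
                select p).ToList();
    }

    public static void Main()
    {
        var oldList = new List<Person>
        {
            new Person { Name = "Bob",   City = "Oslo"  },
            new Person { Name = "Alice", City = "Paris" },
            new Person { Name = "Alice", City = "Rome"  },
        };

        var newList = SortByGroupSize(oldList);

        // Every element of newList is the very same instance as one in oldList.
        bool sameObjects = newList.All(n => oldList.Any(o => ReferenceEquals(n, o)));
        Console.WriteLine(sameObjects);     // True
        Console.WriteLine(newList[0].Name); // Alice
    }
}
```

Only the list is new; the elements are the original instances, so a change made through newList[0] is visible through oldList as well.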

Well, this is embarrassing.

My mistake was indeed based on a misunderstanding:

class Item
{
    internal string Value { get; set; }
    internal string Qux { get; set; }
    internal string Quux { get; set; }
}

var query = from i in list
            group i by new { i.Value, i.Qux } into g // Note: no Quux, no primary key, just what's necessary
            orderby g.Count() descending, g.Key.Value
            select new { // [1]
                Value = g.Key.Value,
                Qux = g.Key.Qux,
                // Quux?
            };

My bad assumption was that the selection at [1] was acting on individual records, much like SQL. Well, hopefully I can save someone who had the same assumption: it is not.

It seems obvious now, but the selection acts on individual groups, and hence here a new object would be created for each group, not for each element.
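To make the difference concrete, here is a small sketch that counts the results of both forms of the query. It groups plain strings by length rather than entities, and the SelectScopeDemo class and its method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectScopeDemo
{
    // "select" directly after "into g" runs once per group.
    public static int ResultsPerGroup(IEnumerable<string> words)
    {
        return (from w in words
                group w by w.Length into g
                select new { Length = g.Key, Size = g.Count() }).Count();
    }

    // Adding "from w in g" iterates each group, so "select" runs once per element.
    public static int ResultsPerElement(IEnumerable<string> words)
    {
        return (from w in words
                group w by w.Length into g
                from w in g
                select w).Count();
    }

    public static void Main()
    {
        var words = new[] { "a", "b", "cc" };
        Console.WriteLine(ResultsPerGroup(words));   // 2 (one per group)
        Console.WriteLine(ResultsPerElement(words)); // 3 (one per element)
    }
}
```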

My second mistake was in focusing on the key, wondering how one would "pass" other properties without using it. I was also worried that we seemed to be required to make shallow copies of our objects. Again this is based on the fact that we were clueless as to how the IGrouping is behaving: since we didn't know we were selecting on the groups, we couldn't even create new objects for each element from the select.

The solution is definitely not impressive:

var query = from i in list
            group i by new { i.Value, i.Qux } into g
            orderby g.Count() descending, g.Key.Value
            select g;

foreach (var group in query)
{
    foreach (var item in group)
        Console.WriteLine("{0} {1} {2} [Original object? {3}]",
            item.Value,
            item.Qux,
            item.Quux,
            list.Contains(item));

    Console.WriteLine("-");
}

Output:

AAA Foo ... [Original object? True]
AAA Foo ... [Original object? True]
AAA Foo ... [Original object? True]
-
BBB Foo ... [Original object? True]
BBB Foo ... [Original object? True]
-
AAA Bar ... [Original object? True]
-
CCC Foo ... [Original object? True]
-
DDD Foo ... [Original object? True]
-
DDD Bar ... [Original object? True]
-
EEE Foo ... [Original object? True]

Indeed the IGrouping elements are the original elements. From there we can create a new list without any issue.
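Since the question ideally wanted the same list reordered rather than a new one, here is one possible sketch of that. The Entry class and the InPlaceGroupSort.SortByGroupSize helper are hypothetical names chosen for the example. The sorted sequence is materialized with ToList() before the list is cleared, because GroupBy is evaluated lazily and would otherwise read from an already emptied list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity mirroring the shape of Item above.
public sealed class Entry
{
    public string Value { get; set; }
    public string Qux { get; set; }
    public string Quux { get; set; }
}

public static class InPlaceGroupSort
{
    // Reorders the given list so that members of larger groups come first.
    public static void SortByGroupSize<T, TKey>(List<T> list, Func<T, TKey> keySelector)
    {
        // Materialize first: the query is deferred and reads from 'list'.
        var sorted = list.GroupBy(keySelector)
                         .OrderByDescending(g => g.Count())
                         .SelectMany(g => g)
                         .ToList();

        list.Clear();
        list.AddRange(sorted); // same list instance, same element instances
    }

    public static void Main()
    {
        var list = new List<Entry>
        {
            new Entry { Value = "BBB", Qux = "Foo", Quux = "1" },
            new Entry { Value = "AAA", Qux = "Foo", Quux = "2" },
            new Entry { Value = "AAA", Qux = "Foo", Quux = "3" },
        };
        var first = list[0];

        // Anonymous types compare by value, so they work as composite keys.
        SortByGroupSize(list, e => new { e.Value, e.Qux });

        Console.WriteLine(list[0].Quux);                    // 2
        Console.WriteLine(ReferenceEquals(list[2], first)); // True
    }
}
```

OrderByDescending is a stable sort, so groups of equal size keep the order in which their first elements appeared.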

Update: I flatten the result set outside of the query mainly to be able to write the group separator to the console for demonstration purposes, but see Tim's and usr's answers to flatten inside the query using SelectMany (and equivalent Linq syntax).

Needless to say, this was another classic case of "we're in too much of a hurry, let's read some examples here and there and that'll be it", where the proper approach would have been to spend a little bit of time learning the fundamentals. Linq is not SQL.

My apologies for the time you may have wasted trying to clear the confusion. Hopefully someone else in a hurry might benefit from these mistakes now.
