LINQ and GroupBy

I haven't done much LINQ before, so I often find some aspects confusing. Recently someone created a query that looks like the following using the GroupBy operator. Here's what they did:

List<int> ranges = new List<int>() {100, 1000, 1000000};

List<int> sizes = new List<int>(new int[]{99,98,10,5,5454, 12432, 11, 12432, 992, 56, 222});

var xx = sizes.GroupBy (size => ranges.First(range => range >= size));

xx.Dump();

Basically, I am quite confused as to how the key expression works, i.e. ranges.First(range => range >= size).

Can anyone shed some light? Can it be decomposed further to make this easier to understand? I thought that First would produce one result.

Thanks in advance.

size => ranges.First(range => range >= size) is the Func that builds the key on which the sizes are grouped. It takes the current size and finds the first range that is greater than or equal to it.


How it works:

For size 99, the first range that is >= 99 is 100. So the calculated key value is 100, and the size goes into the group with key 100.

The next sizes, 98, 10 and 5, also get key 100 and go into that group.

For size 5454, the calculated key value is 1000000 (it's the first range that is greater than or equal to 5454). So a new key is created, and the size goes into the group with key 1000000.

Etc.
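To see the walkthrough above for every element, here is a minimal sketch that prints the key computed for each size. The class name KeyPerSizeDemo and the helper KeyFor are made up for illustration; the helper just wraps the key selector from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class KeyPerSizeDemo
{
    // Hypothetical helper (not in the original code): the key selector as a method.
    public static int KeyFor(List<int> ranges, int size)
    {
        // First range that is >= size; throws if no such range exists.
        return ranges.First(range => range >= size);
    }

    static void Main()
    {
        var ranges = new List<int> { 100, 1000, 1000000 };
        var sizes = new List<int> { 99, 98, 10, 5, 5454, 12432, 11, 12432, 992, 56, 222 };

        // Print the key that GroupBy would compute for each element.
        foreach (int size in sizes)
        {
            Console.WriteLine("size {0} -> key {1}", size, KeyFor(ranges, size));
        }
    }
}
```

Elements that print the same key end up in the same group.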

ranges.First(range => range >= size) returns an int, the first range that is >= the current size value. So every size belongs to one range. That is the group.

Note that First throws an exception if there's no range which is >= the given size.
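As a rough illustration of that failure case: First throws an InvalidOperationException when no element matches the predicate. The sketch below triggers it with a size larger than every range, and shows one possible way to avoid the exception. The class name FirstFailureDemo and the helper TryGetRange are made up for this example; returning null is just one option, not something the original code does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FirstFailureDemo
{
    // Hypothetical helper: returns null instead of throwing when no range fits.
    public static int? TryGetRange(List<int> ranges, int size)
    {
        return ranges
            .Where(range => range >= size)
            .Select(range => (int?)range) // so that "no match" becomes null rather than 0
            .FirstOrDefault();
    }

    static void Main()
    {
        var ranges = new List<int> { 100, 1000, 1000000 };

        try
        {
            // 2000000 is larger than every range, so First has nothing to return.
            int key = ranges.First(range => range >= 2000000);
            Console.WriteLine(key);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("No range is >= 2000000");
        }

        Console.WriteLine(TryGetRange(ranges, 2000000).HasValue);
        Console.WriteLine(TryGetRange(ranges, 500));
    }
}
```

With the null-returning variant, sizes that fit no range would all be grouped under a null key instead of aborting the query.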

If you wrote the same thing with a foreach loop, it would look like this:

var myGroup = new Dictionary<int, List<int>>();

foreach (int size in sizes)
{
    // Same as the key selector: the first range that is >= size.
    // (This is also the smallest such range, because ranges is sorted ascending.)
    int range = ranges.First(r => r >= size);

    // GroupBy does this bookkeeping for you: create the group the first time a key is seen.
    if (!myGroup.TryGetValue(range, out List<int> group))
    {
        group = new List<int>();
        myGroup[range] = group;
    }

    // Each input element is added to the group for its key.
    group.Add(size);
}

The only difference with your LINQ query is that you don't write the loop yourself: GroupBy does the grouping internally with a hash table, so finding the group for a key is fast on average. And once you know LINQ well, it's more readable.

Not sure whether it adds to the clarity, but if you really want to break it down, you could do the following (I'm guessing you are using LINQPad):

List<int> ranges = new List<int>() {100, 1000, 1000000};
List<int> sizes = new List<int>(new int[]{99,98,10,5,5454, 12432, 11, 12432, 992, 56, 222});

void Main()
{
    var xx = sizes.GroupBy (size => GetRangeValue(size));

    xx.Dump();
}

private int GetRangeValue(int size)
{
    // find the first value in ranges which is bigger than or equal to our size
    return ranges.First(range => range >= size);
}

And yes, you are correct, First does produce one result.

Indeed, First returns one value, which becomes the key for grouping.

What happens here is: First is called for each value in sizes, returning the first range that is larger than or equal to that size (100, 100, 100, 100, 1000000, 1000000, etc.), and the sizes are grouped by this value. For every range a grouping is returned, for instance 100: 99, 98, 10, 5, 11...

GroupBy essentially builds a lookup table (dictionary) where the items in your source that share the same key are collected into a list, and that list is assigned to the key in the lookup table.
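To make the lookup-table picture concrete, here is a small sketch that converts the GroupBy result into an actual Dictionary. This is for illustration only: GroupBy itself returns a sequence of IGrouping objects, not a Dictionary, and the names LookupDemo and GroupSizes are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LookupDemo
{
    // Hypothetical helper: materializes the groups as key -> list of sizes.
    public static Dictionary<int, List<int>> GroupSizes(List<int> ranges, List<int> sizes)
    {
        return sizes
            .GroupBy(size => ranges.First(range => range >= size))
            .ToDictionary(group => group.Key, group => group.ToList());
    }

    static void Main()
    {
        var ranges = new List<int> { 100, 1000, 1000000 };
        var sizes = new List<int> { 99, 98, 10, 5, 5454, 12432, 11, 12432, 992, 56, 222 };

        Dictionary<int, List<int>> table = GroupSizes(ranges, sizes);

        // Items within a group keep the order in which they appear in sizes.
        Console.WriteLine(string.Join(", ", table[100]));     // 99, 98, 10, 5, 11, 56
        Console.WriteLine(string.Join(", ", table[1000]));    // 992, 222
        Console.WriteLine(string.Join(", ", table[1000000])); // 5454, 12432, 12432
    }
}
```

If a lookup by key is all you need, ToLookup with the same key selector gives a similar result without the extra ToDictionary step.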

Here is a sample program that replaces your call to xx.Dump() with a code block that pretty-prints the output in a way specific to your example. Notice the use of OrderBy to order the keys (range values) first, and then the items within each group.

using System;
using System.Collections.Generic;
using System.Linq;

class GroupByDemo
{
    static public void Main(string[] args)
    {
        List<int> ranges = new List<int>() {100, 1000, 1000000};

        List<int> sizes = new List<int>(
            new int[]{99,98,10,5,5454, 12432, 11, 12432, 992, 56, 222});

        var sizesByRange =
            sizes.GroupBy(size => ranges.First(range => range >= size));

        // Pretty-print the 'GroupBy' results.
        foreach (var range in sizesByRange.OrderBy(r => r.Key))
        {
            Console.WriteLine("Sizes up to range limit '{0}':", range.Key);

            foreach (var size in range.ToList().OrderBy(s => s))
            {
                Console.WriteLine("  {0}", size);
            }
        }
        Console.WriteLine("--");
    }
}

Expected Results

Notice that 12432 appears twice in the last group because that value appears twice in the original source list.

Sizes up to range limit '100':
  5
  10
  11
  56
  98
  99
Sizes up to range limit '1000':
  222
  992
Sizes up to range limit '1000000':
  5454
  12432
  12432
--
