C# Join and lambda expression

I'm trying to understand the following line of code. sequence1 and sequence2 are two arrays of strings. The code is supposed to achieve an inner-join effect. Can someone please explain how to read it? For example, what is x => x.gn2? I understand that n1 => n1.Length is the join condition. I'm struggling with lambda expressions. Many thanks in advance!

var j = sequence1.GroupJoin ( sequence2 , 
n1 => n1.Length , n2 => n2.Length , (n1, gn2) => new { n1, gn2 })
.SelectMany (x => x.gn2,(x, n2) => new { x.n1, n2 });

I am not sure that the expression does what you think it does. But here it is. Let's rewrite this a little bit:

// Requires: using System; using System.Linq;
static void Foo1()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var result = sequence1.GroupJoin(sequence2,
    n1 => n1.Length, n2 => n2.Length, (n1, gn2) => new { n1, gn2 })
    .SelectMany(x => x.gn2, (x, n2) => new { x.n1, n2 });

    result.ToList().ForEach(Console.WriteLine);
}

And now rewrite it again in another equivalent form:

static void Foo2()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var joinResult = sequence1.GroupJoin(
        sequence2,
        n1 => n1.Length,
        n2 => n2.Length,
        (n1, gn2) => new {n1, gn2});

    Console.WriteLine("joinResult: ");
    joinResult.ToList().ForEach(Console.WriteLine);

    var result = joinResult.SelectMany(
        x => x.gn2,
        (x, n2) => new { x.n1, n2 });

    Console.WriteLine("result: ");
    result.ToList().ForEach(Console.WriteLine);
}

Now let's take the first part (the GroupJoin):

    var joinResult = sequence1.GroupJoin(
        sequence2,
        n1 => n1.Length,
        n2 => n2.Length,
        (n1, gn2) => new {n1, gn2});

We are joining two collections. Note that GroupJoin is an extension method that is invoked on sequence1. Reading the documentation of GroupJoin, we see that sequence1 is the outer sequence and the first parameter, sequence2, is the inner sequence.
The second parameter, n1 => n1.Length, is a method that generates a key for each element of the outer collection.
The third parameter, n2 => n2.Length, is a method that generates a key for each element of the inner collection.
GroupJoin now has enough data to match elements of the first sequence with elements of the second sequence. In our case, strings are matched based on their length. All strings of length 2 from the first sequence are matched with the strings of length 2 in the second sequence. All strings of length 3 from the first sequence are matched with the strings of length 3 in the second sequence. And so on for any other string length.
The last parameter, (n1, gn2) => new {n1, gn2}, is a method that takes an element from the outer sequence (that is, sequence1) together with a collection of all matching elements from sequence2, and generates some result. In this case the result is an anonymous type with two fields (strictly speaking, read-only properties):

  • The first field, named n1, is the element from sequence1.
  • The second field, named gn2, is the collection of all matching elements from sequence2.
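To make the shape of the GroupJoin result visible, here is a minimal sketch that prints each n1 next to the contents of its gn2 group. The class GroupJoinShape, the helper Describe, and the extra string "7890" are assumptions added for illustration; they are not part of the original code. The extra string has no partner of length 4, so its group comes out empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinShape
{
    // Hypothetical helper: renders each GroupJoin result as "n1 -> [matches]".
    public static List<string> Describe(string[] sequence1, string[] sequence2)
    {
        return sequence1
            .GroupJoin(
                sequence2,
                n1 => n1.Length,              // key of each outer element
                n2 => n2.Length,              // key of each inner element
                (n1, gn2) => new { n1, gn2 }) // one result per outer element
            .Select(x => x.n1 + " -> [" + string.Join(", ", x.gn2) + "]")
            .ToList();
    }

    public static void Main()
    {
        // "7890" is an illustrative addition with no match in sequence2.
        string[] sequence1 = { "12", "34", "567", "7890" };
        string[] sequence2 = { "ab", "cd", "efg" };

        foreach (string line in Describe(sequence1, sequence2))
        {
            Console.WriteLine(line);
        }
        // 12 -> [ab, cd]
        // 34 -> [ab, cd]
        // 567 -> [efg]
        // 7890 -> []
    }
}
```

Note that GroupJoin still produces a result for "7890", only with an empty gn2.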

Next comes the SelectMany:

var result = joinResult.SelectMany(
    x => x.gn2,
    (x, n2) => new { x.n1, n2 });

SelectMany is an extension method that is invoked here on joinResult. Take a moment to look at the end of my post, where I copied the output of the application, to see what the joinResult sequence looks like. Note that each element x in joinResult is an anonymous type with fields {n1, gn2}, where gn2 itself is a sequence.

The first parameter, x => x.gn2, is a delegate written in lambda form. SelectMany calls this method once for each element of the input sequence joinResult, and each call gives you the chance to generate an intermediate collection. Remember that each element x in joinResult is an anonymous type with fields {n1, gn2}, where gn2 itself is a sequence. Given this, the lambda x => x.gn2 transforms each element x into the collection x.gn2.
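The role of x => x.gn2 may be easier to see in isolation. The sketch below uses the one-argument overload of SelectMany on hand-built data that mimics the shape of joinResult; the class FlattenDemo, the method FlattenGroups, and the sample values are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenDemo
{
    public static List<string> FlattenGroups()
    {
        // Hand-built stand-in for joinResult: each element has n1 and a gn2 collection.
        var joinResult = new[]
        {
            new { n1 = "12",  gn2 = new[] { "ab", "cd" } },
            new { n1 = "567", gn2 = new[] { "efg" } }
        };

        // The lambda hands SelectMany one intermediate collection per element;
        // SelectMany concatenates those collections into a single flat sequence.
        return joinResult.SelectMany(x => x.gn2).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FlattenGroups()));
        // ab, cd, efg
    }
}
```

With only this one lambda the n1 values are lost, which is why the original code also passes a second lambda.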

Now that SelectMany can generate a new intermediate sequence from each element of the input sequence, it proceeds to process that intermediate sequence. For that we have the second parameter.

The second parameter, (x, n2) => new { x.n1, n2 }, is another delegate written in lambda form. SelectMany calls this delegate for each element of the intermediate sequence, passing two arguments:

  • The first parameter is the current element from the input sequence.
  • The second parameter is the current element of the intermediate sequence.

This lambda transforms these two parameters into another anonymous type with two fields:

  • The first field, named n1. If you follow the data flow, it comes from field n1 of the current element in joinResult.
  • The second field, named n2, is the current element of the intermediate sequence.
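One way to picture what the two lambdas do together is a pair of nested loops. The sketch below is only an illustration of the behaviour described above, not how SelectMany is actually implemented (the real method is lazy); the class SelectManyByHand, its method names, and the "n1:n2" string format are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectManyByHand
{
    // The original expression, with each pair rendered as "n1:n2" for easy comparison.
    public static List<string> WithLinq(string[] sequence1, string[] sequence2)
    {
        return sequence1
            .GroupJoin(sequence2, n1 => n1.Length, n2 => n2.Length, (n1, gn2) => new { n1, gn2 })
            .SelectMany(x => x.gn2, (x, n2) => new { x.n1, n2 })
            .Select(r => r.n1 + ":" + r.n2)
            .ToList();
    }

    // The same result, with SelectMany spelled out as nested loops.
    public static List<string> WithLoops(string[] sequence1, string[] sequence2)
    {
        var joinResult = sequence1.GroupJoin(
            sequence2, n1 => n1.Length, n2 => n2.Length, (n1, gn2) => new { n1, gn2 });

        var result = new List<string>();
        foreach (var x in joinResult)         // each element of the input sequence
        {
            foreach (var n2 in x.gn2)         // first lambda: x => x.gn2
            {
                result.Add(x.n1 + ":" + n2);  // second lambda: (x, n2) => ...
            }
        }
        return result;
    }

    public static void Main()
    {
        string[] sequence1 = { "12", "34", "567" };
        string[] sequence2 = { "ab", "cd", "efg" };

        Console.WriteLine(string.Join(" ", WithLinq(sequence1, sequence2)));
        Console.WriteLine(string.Join(" ", WithLoops(sequence1, sequence2)));
    }
}
```

Both lines printed by Main should be identical.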

This all sounds awfully complicated, but if you debug the app and place breakpoints at strategic points, it will become clear.

Let's rewrite this one more time in an equivalent form:

static void Foo3()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var joinResult = sequence1.GroupJoin(
        sequence2,
        element1 => GetKey1(element1),
        element2 => GetKey2(element2),
        (n1, gn2) =>
        {
            // place a breakpoint on the next line
            return new {n1, gn2};
        });

    Console.WriteLine("joinResult: ");
    joinResult.ToList().ForEach(Console.WriteLine);

    var result = joinResult.SelectMany(
        x =>
        {
            // place a breakpoint on the next line
            return x.gn2;
        },
        (x, n2) =>
        {
            // place a breakpoint on the next line
            return new {x.n1, n2};
        });

    Console.WriteLine("result: ");
    result.ToList().ForEach(Console.WriteLine);
}

private static int GetKey1(string element1)
{
    // place a breakpoint on the next line
    return element1.Length;
}

private static int GetKey2(string element2)
{
    // place a breakpoint on the next line
    return element2.Length;
}

I suggest you run method Foo3, which is the most verbose, and put breakpoints where indicated. That will help you figure out in more detail how all this works.

Finally, I must say that one reason all this appears as complicated as it does is the way the variables were named. Here is another form, not as verbose as Foo3, that may be reasonably easy to read:

static void Foo4()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var groupJoinResult = sequence1.GroupJoin(
        sequence2,
        elementFromSequence1 => elementFromSequence1.Length,
        elementFromSequence2 => elementFromSequence2.Length,
        (elementFromSequence1, matchingCollectionFromSequence2) => new { elementFromSequence1, matchingCollectionFromSequence2 });

    var result = groupJoinResult.SelectMany(
        inputElement => inputElement.matchingCollectionFromSequence2,
        (inputElement, elementFromMatchingCollection) => new { inputElement.elementFromSequence1, elementFromMatchingCollection });

    result.ToList().ForEach(Console.WriteLine);
}

Note: The output of running Foo3 is:

joinResult:
{ n1 = 12, gn2 = System.Linq.Lookup`2+Grouping[System.Int32,System.String] }
{ n1 = 34, gn2 = System.Linq.Lookup`2+Grouping[System.Int32,System.String] }
{ n1 = 567, gn2 = System.Linq.Lookup`2+Grouping[System.Int32,System.String] }
result:
{ n1 = 12, n2 = ab }
{ n1 = 12, n2 = cd }
{ n1 = 34, n2 = ab }
{ n1 = 34, n2 = cd }
{ n1 = 567, n2 = efg }
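Since the question mentions an inner-join effect, it may help to compare the expression with Enumerable.Join, which pairs outer and inner elements by key directly. The sketch below is an illustration under that assumption; the class InnerJoinComparison, its method names, the "n1:n2" string format, and the unmatched string "7890" are not from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InnerJoinComparison
{
    // The expression from the question, with each pair rendered as "n1:n2".
    public static List<string> ViaGroupJoin(string[] sequence1, string[] sequence2)
    {
        return sequence1
            .GroupJoin(sequence2, n1 => n1.Length, n2 => n2.Length, (n1, gn2) => new { n1, gn2 })
            .SelectMany(x => x.gn2, (x, n2) => x.n1 + ":" + n2)
            .ToList();
    }

    // The same pairing expressed with Enumerable.Join.
    public static List<string> ViaJoin(string[] sequence1, string[] sequence2)
    {
        return sequence1
            .Join(sequence2, n1 => n1.Length, n2 => n2.Length, (n1, n2) => n1 + ":" + n2)
            .ToList();
    }

    public static void Main()
    {
        // "7890" has no partner of length 4, so neither version emits a pair for it.
        string[] sequence1 = { "12", "34", "567", "7890" };
        string[] sequence2 = { "ab", "cd", "efg" };

        Console.WriteLine(string.Join(" ", ViaGroupJoin(sequence1, sequence2)));
        Console.WriteLine(string.Join(" ", ViaJoin(sequence1, sequence2)));
        // Both lines: 12:ab 12:cd 34:ab 34:cd 567:efg
    }
}
```

An outer element with an empty group contributes nothing to SelectMany, which is what gives the combined expression its inner-join behaviour.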
