简体   繁体   English

LINQ - 全外连接

[英]LINQ - Full Outer Join

I have a list of people's ID and their first name, and a list of people's ID and their surname.我有一个人的 ID 和他们的名字的列表,以及一个人的 ID 和他们的姓氏的列表。 Some people don't have a first name and some don't have a surname;有些人没有名字,有些人没有姓氏; I'd like to do a full outer join on the two lists.我想对两个列表进行完整的外部联接。

So the following lists:因此,以下列表:

ID  FirstName
--  ---------
 1  John
 2  Sue

ID  LastName
--  --------
 1  Doe
 3  Smith

Should produce:应该产生:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue
 3             Smith

I'm new to LINQ (so forgive me if I'm being lame) and have found quite a few solutions for 'LINQ Outer Joins' which all look quite similar, but really seem to be left outer joins.我是 LINQ 的新手(所以请原谅我的跛脚)并且已经找到了很多“LINQ 外连接”的解决方案,它们看起来都非常相似,但实际上似乎是左外连接。

My attempts so far go something like this:到目前为止,我的尝试是这样的:

private void OuterJoinTest()
{
    List<FirstName> firstNames = new List<FirstName>();
    firstNames.Add(new FirstName { ID = 1, Name = "John" });
    firstNames.Add(new FirstName { ID = 2, Name = "Sue" });

    List<LastName> lastNames = new List<LastName>();
    lastNames.Add(new LastName { ID = 1, Name = "Doe" });
    lastNames.Add(new LastName { ID = 3, Name = "Smith" });

    var outerJoin = from first in firstNames
        join last in lastNames
        on first.ID equals last.ID
        into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            id = first != null ? first.ID : last.ID,
            firstname = first != null ? first.Name : string.Empty,
            surname = last != null ? last.Name : string.Empty
        };
    }
}

public class FirstName
{
    public int ID;

    public string Name;
}

public class LastName
{
    public int ID;

    public string Name;
}

But this returns:但这会返回:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue

What am I doing wrong?我究竟做错了什么?

Update 1: providing a truly generalized extension method FullOuterJoin更新一:提供真正通用的扩展方法FullOuterJoin
Update 2: optionally accepting a custom IEqualityComparer for the key type更新 2:可选地接受键类型的自定义IEqualityComparer
Update 3 : this implementation has recently become part of MoreLinq - Thanks guys!更新 3 :此实现最近成为MoreLinq一部分-谢谢大家!

Edit Added FullOuterGroupJoin ( ideone ).编辑添加FullOuterGroupJoin ( ideone )。 I reused the GetOuter<> implementation, making this a fraction less performant than it could be, but I'm aiming for 'highlevel' code, not bleeding-edge optimized, right now.我重用了GetOuter<>实现,这使得它的性能比它可能的要低一些,但我现在的目标是“高级”代码,而不是前沿优化。

See it live on http://ideone.com/O36nWchttp://ideone.com/O36nWc 上实时查看

static void Main(string[] args)
{
    var ax = new[] { 
        new { id = 1, name = "John" },
        new { id = 2, name = "Sue" } };
    var bx = new[] { 
        new { id = 1, surname = "Doe" },
        new { id = 3, surname = "Smith" } };

    ax.FullOuterJoin(bx, a => a.id, b => b.id, (a, b, id) => new {a, b})
        .ToList().ForEach(Console.WriteLine);
}

Prints the output:打印输出:

{ a = { id = 1, name = John }, b = { id = 1, surname = Doe } }
{ a = { id = 2, name = Sue }, b =  }
{ a = , b = { id = 3, surname = Smith } }

You could also supply defaults: http://ideone.com/kG4kqO您还可以提供默认值: http : //ideone.com/kG4kqO

    ax.FullOuterJoin(
            bx, a => a.id, b => b.id, 
            (a, b, id) => new { a.name, b.surname },
            new { id = -1, name    = "(no firstname)" },
            new { id = -2, surname = "(no surname)" }
        )

Printing:印刷:

{ name = John, surname = Doe }
{ name = Sue, surname = (no surname) }
{ name = (no firstname), surname = Smith }

Explanation of terms used:所用术语的解释:

Joining is a term borrowed from relational database design:联接是从关系数据库设计中借来的一个术语:

  • A join will repeat elements from a as many times as there are elements in b with corresponding key (ie: nothing if b were empty).连接将重复a中的元素,次数与b具有相应键的元素一样多(即:如果b为空,则没有任何内容)。 Database lingo calls this inner (equi)join .数据库术语将此称为inner (equi)join
  • An outer join includes elements from a for which no corresponding element exists in b .外连接包括来自a元素,而b存在对应的元素 (ie: even results if b were empty). (即:即使b是空的结果)。 This is usually referred to as left join .这通常称为left join
  • A full outer join includes records from a as well as b if no corresponding element exists in the other.如果另一个中存在相应的元素,则完全外连接包括来自ab记录。 (ie even results if a were empty) (即,如果a为空,即使结果也是如此)

Something not usually seen in RDBMS is a group join [1] :东西通常不会在RDBMS看到的是一组加入[1]:

  • A group join , does the same as described above, but instead of repeating elements from a for multiple corresponding b , it groups the records with corresponding keys. group join的作用与上述相同,不是为多个对应的b重复来自a元素,而是将记录与相应的键分组 This is often more convenient when you wish to enumerate through 'joined' records, based on a common key.当您希望基于公共键枚举“连接”记录时,这通常更方便。

See also GroupJoin which contains some general background explanations as well.另请参阅GroupJoin ,其中还包含一些一般背景说明。


[1] (I believe Oracle and MSSQL have proprietary extensions for this) [1] (我相信 Oracle 和 MSSQL 对此有专有扩展)

Full code完整代码

A generalized 'drop-in' Extension class for this一个通用的“插入”扩展类

internal static class MyExtensions
{
    internal static IEnumerable<TResult> FullOuterGroupJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> a,
        IEnumerable<TB> b,
        Func<TA, TKey> selectKeyA, 
        Func<TB, TKey> selectKeyB,
        Func<IEnumerable<TA>, IEnumerable<TB>, TKey, TResult> projection,
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp?? EqualityComparer<TKey>.Default;
        var alookup = a.ToLookup(selectKeyA, cmp);
        var blookup = b.ToLookup(selectKeyB, cmp);

        var keys = new HashSet<TKey>(alookup.Select(p => p.Key), cmp);
        keys.UnionWith(blookup.Select(p => p.Key));

        var join = from key in keys
                   let xa = alookup[key]
                   let xb = blookup[key]
                   select projection(xa, xb, key);

        return join;
    }

    internal static IEnumerable<TResult> FullOuterJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> a,
        IEnumerable<TB> b,
        Func<TA, TKey> selectKeyA, 
        Func<TB, TKey> selectKeyB,
        Func<TA, TB, TKey, TResult> projection,
        TA defaultA = default(TA), 
        TB defaultB = default(TB),
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp?? EqualityComparer<TKey>.Default;
        var alookup = a.ToLookup(selectKeyA, cmp);
        var blookup = b.ToLookup(selectKeyB, cmp);

        var keys = new HashSet<TKey>(alookup.Select(p => p.Key), cmp);
        keys.UnionWith(blookup.Select(p => p.Key));

        var join = from key in keys
                   from xa in alookup[key].DefaultIfEmpty(defaultA)
                   from xb in blookup[key].DefaultIfEmpty(defaultB)
                   select projection(xa, xb, key);

        return join;
    }
}

I don't know if this covers all cases, logically it seems correct.我不知道这是否涵盖所有情况,从逻辑上讲似乎是正确的。 The idea is to take a left outer join and right outer join then take the union of the results.这个想法是采用左外连接和右外连接,然后采用结果的并集。

var firstNames = new[]
{
    new { ID = 1, Name = "John" },
    new { ID = 2, Name = "Sue" },
};
var lastNames = new[]
{
    new { ID = 1, Name = "Doe" },
    new { ID = 3, Name = "Smith" },
};
var leftOuterJoin =
    from first in firstNames
    join last in lastNames on first.ID equals last.ID into temp
    from last in temp.DefaultIfEmpty()
    select new
    {
        first.ID,
        FirstName = first.Name,
        LastName = last?.Name,
    };
var rightOuterJoin =
    from last in lastNames
    join first in firstNames on last.ID equals first.ID into temp
    from first in temp.DefaultIfEmpty()
    select new
    {
        last.ID,
        FirstName = first?.Name,
        LastName = last.Name,
    };
var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);

This works as written since it is in LINQ to Objects.由于它在 LINQ to Objects 中,因此它按编写的方式工作。 If LINQ to SQL or other, the query processor might not support safe navigation or other operations.如果是 LINQ to SQL 或其他,查询处理器可能不支持安全导航或其他操作。 You'd have to use the conditional operator to conditionally get the values.您必须使用条件运算符来有条件地获取值。

ie, IE,

var leftOuterJoin =
    from first in firstNames
    join last in lastNames on first.ID equals last.ID into temp
    from last in temp.DefaultIfEmpty()
    select new
    {
        first.ID,
        FirstName = first.Name,
        LastName = last != null ? last.Name : default,
    };

I think there are problems with most of these, including the accepted answer, because they don't work well with Linq over IQueryable either due to doing too many server round trips and too much data returns, or doing too much client execution.我认为其中的大多数都存在问题,包括已接受的答案,因为它们无法通过 IQueryable 与 Linq 配合使用,要么是由于服务器往返次数过多和数据返回过多,要么是客户端执行过多。

For IEnumerable I don't like Sehe's answer or similar because it has excessive memory use (a simple 10000000 two list test ran Linqpad out of memory on my 32GB machine).对于 IEnumerable,我不喜欢 Sehe 的答案或类似答案,因为它占用了过多的内存(一个简单的 10000000 两个列表测试在我的 32GB 机器上运行 Linqpad 内存不足)。

Also, most of the others don't actually implement a proper Full Outer Join because they are using a Union with a Right Join instead of Concat with a Right Anti Semi Join, which not only eliminates the duplicate inner join rows from the result, but any proper duplicates that existed originally in the left or right data.此外,大多数其他人实际上并没有实现正确的完全外部联接,因为他们使用带有右联接的联合而不是带有右反半联接的 Concat,这不仅从结果中消除了重复的内联接行,而且原始存在于左侧或右侧数据中的任何正确重复项。

So here are my extensions that handle all of these issues, generate SQL as well as implementing the join in LINQ to SQL directly, executing on the server, and is faster and with less memory than others on Enumerables:所以这里是我的扩展,它们处理所有这些问题,生成 SQL 以及直接在 LINQ to SQL 中实现连接,在服务器上执行,并且比 Enumerables 上的其他扩展更快且内存更少:

public static class Ext {
    public static IEnumerable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return from left in leftItems
               join right in rightItems on leftKeySelector(left) equals rightKeySelector(right) into temp
               from right in temp.DefaultIfEmpty()
               select resultSelector(left, right);
    }

    public static IEnumerable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return from right in rightItems
               join left in leftItems on rightKeySelector(right) equals leftKeySelector(left) into temp
               from left in temp.DefaultIfEmpty()
               select resultSelector(left, right);
    }

    public static IEnumerable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    public static IEnumerable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        var hashLK = new HashSet<TKey>(from l in leftItems select leftKeySelector(l));
        return rightItems.Where(r => !hashLK.Contains(rightKeySelector(r))).Select(r => resultSelector(default(TLeft),r));
    }

    public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector)  where TLeft : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    private static Expression<Func<TP, TC, TResult>> CastSMBody<TP, TC, TResult>(LambdaExpression ex, TP unusedP, TC unusedC, TResult unusedRes) => (Expression<Func<TP, TC, TResult>>)ex;

    public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        var sampleAnonLR = new { left = default(TLeft), rightg = default(IEnumerable<TRight>) };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
        var parmC = Expression.Parameter(typeof(TRight), "c");
        var argLeft = Expression.PropertyOrField(parmP, "left");
        var newleftrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, parmC), parmP, parmC), sampleAnonLR, default(TRight), default(TResult));

        return leftItems.AsQueryable().GroupJoin(rightItems, leftKeySelector, rightKeySelector, (left, rightg) => new { left, rightg }).SelectMany(r => r.rightg.DefaultIfEmpty(), newleftrs);
    }

    public static IQueryable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        var sampleAnonLR = new { leftg = default(IEnumerable<TLeft>), right = default(TRight) };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
        var parmC = Expression.Parameter(typeof(TLeft), "c");
        var argRight = Expression.PropertyOrField(parmP, "right");
        var newrightrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, parmC, argRight), parmP, parmC), sampleAnonLR, default(TLeft), default(TResult));

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).SelectMany(l => l.leftg.DefaultIfEmpty(), newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    private static Expression<Func<TP, TResult>> CastSBody<TP, TResult>(LambdaExpression ex, TP unusedP, TResult unusedRes) => (Expression<Func<TP, TResult>>)ex;

    public static IQueryable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        var sampleAnonLgR = new { leftg = default(IEnumerable<TLeft>), right = default(TRight) };
        var parmLgR = Expression.Parameter(sampleAnonLgR.GetType(), "lgr");
        var argLeft = Expression.Constant(default(TLeft), typeof(TLeft));
        var argRight = Expression.PropertyOrField(parmLgR, "right");
        var newrightrs = CastSBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, argRight), parmLgR), sampleAnonLgR, default(TResult));

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).Where(lgr => !lgr.leftg.Any()).Select(newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }
}

The difference between a Right Anti-Semi-Join is mostly moot with Linq to Objects or in the source, but makes a difference on the server (SQL) side in the final answer, removing an unnecessary JOIN . Right Anti-Semi-Join 之间的区别在 Linq to Objects 或源代码中大多没有实际意义,但在最终答案中的服务器 (SQL) 端有所不同,删除了不必要的JOIN

The hand coding of Expression to handle merging an Expression<Func<>> into a lambda could be improved with LinqKit, but it would be nice if the language/compiler had added some help for that.使用 LinqKit 可以改进用于处理将Expression<Func<>>合并为 lambda 的Expression的手动编码,但如果语言/编译器为此添加了一些帮助,那就太好了。 The FullOuterJoinDistinct and RightOuterJoin functions are included for completeness, but I did not re-implement FullOuterGroupJoin yet.为了完整FullOuterJoinDistinct ,包含了FullOuterJoinDistinctRightOuterJoin函数,但我还没有重新实现FullOuterGroupJoin

I wrote another version of a full outer join for IEnumerable for cases where the key is orderable, which is about 50% faster than combining the left outer join with the right anti semi join, at least on small collections.我为IEnumerable编写了另一个版本的完整外连接,用于键可排序的情况,这比将左外连接与右反半连接组合快约 50%,至少在小集合上。 It goes through each collection after sorting just once.它在排序后遍历每个集合。

I also added another answer for a version that works with EF by replacing the Invoke with a custom expansion.我还通过用自定义扩展替换Invoke为与 EF 一起使用的版本添加了另一个答案

Here is an extension method doing that:这是一个扩展方法:

public static IEnumerable<KeyValuePair<TLeft, TRight>> FullOuterJoin<TLeft, TRight>(this IEnumerable<TLeft> leftItems, Func<TLeft, object> leftIdSelector, IEnumerable<TRight> rightItems, Func<TRight, object> rightIdSelector)
{
    var leftOuterJoin = from left in leftItems
        join right in rightItems on leftIdSelector(left) equals rightIdSelector(right) into temp
        from right in temp.DefaultIfEmpty()
        select new { left, right };

    var rightOuterJoin = from right in rightItems
        join left in leftItems on rightIdSelector(right) equals leftIdSelector(left) into temp
        from left in temp.DefaultIfEmpty()
        select new { left, right };

    var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);

    return fullOuterJoin.Select(x => new KeyValuePair<TLeft, TRight>(x.left, x.right));
}

I'm guessing @sehe's approach is stronger, but until I understand it better, I find myself leap-frogging off of @MichaelSander's extension.我猜@sehe 的方法更强大,但在我更好地理解它之前,我发现自己跳过了@MichaelSander 的扩展。 I modified it to match the syntax and return type of the built-in Enumerable.Join() method described here .我修改了它以匹配此处描述的内置 Enumerable.Join() 方法的语法和返回类型。 I appended the "distinct" suffix in respect to @cadrell0's comment under @JeffMercado's solution.我在@JeffMercado 的解决方案下为@cadrell0 的评论附加了“distinct”后缀。

public static class MyExtensions {

    public static IEnumerable<TResult> FullJoinDistinct<TLeft, TRight, TKey, TResult> (
        this IEnumerable<TLeft> leftItems, 
        IEnumerable<TRight> rightItems, 
        Func<TLeft, TKey> leftKeySelector, 
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector
    ) {

        var leftJoin = 
            from left in leftItems
            join right in rightItems 
              on leftKeySelector(left) equals rightKeySelector(right) into temp
            from right in temp.DefaultIfEmpty()
            select resultSelector(left, right);

        var rightJoin = 
            from right in rightItems
            join left in leftItems 
              on rightKeySelector(right) equals leftKeySelector(left) into temp
            from left in temp.DefaultIfEmpty()
            select resultSelector(left, right);

        return leftJoin.Union(rightJoin);
    }

}

In the example, you would use it like this:在示例中,您将像这样使用它:

var test = 
    firstNames
    .FullJoinDistinct(
        lastNames,
        f=> f.ID,
        j=> j.ID,
        (f,j)=> new {
            ID = f == null ? j.ID : f.ID, 
            leftName = f == null ? null : f.Name,
            rightName = j == null ? null : j.Name
        }
    );

In the future, as I learn more, I have a feeling I'll be migrating to @sehe's logic given it's popularity.将来,随着我了解的更多,我有一种感觉,鉴于它的受欢迎程度,我将迁移到@sehe 的逻辑。 But even then I'll have to be careful, because I feel it is important to have at least one overload that matches the syntax of the existing ".Join()" method if feasible, for two reasons:但即便如此,我也必须小心,因为我觉得如果可行,至少有一个与现有“.Join()”方法的语法相匹配的重载很重要,原因有二:

  1. Consistency in methods helps save time, avoid errors, and avoid unintended behavior.方法的一致性有助于节省时间、避免错误和避免意外行为。
  2. If there ever is an out-of-the-box ".FullJoin()" method in the future, I would imagine it will try to keep to the syntax of the currently existing ".Join()" method if it can.如果将来有开箱即用的“.FullJoin()”方法,我想它会尽量保持当前存在的“.Join()”方法的语法,如果可以的话。 If it does, then if you want to migrate to it, you can simply rename your functions without changing the parameters or worrying about different return types breaking your code.如果是,那么如果您想迁移到它,您可以简单地重命名您的函数,而无需更改参数或担心不同的返回类型会破坏您的代码。

I'm still new with generics, extensions, Func statements, and other features, so feedback is certainly welcome.我对泛型、扩展、Func 语句和其他功能仍然很陌生,因此当然欢迎反馈。

EDIT: Didn't take me long to realize there was a problem with my code.编辑:没多久我就意识到我的代码有问题。 I was doing a .Dump() in LINQPad and looking at the return type.我正在 LINQPad 中执行 .Dump() 并查看返回类型。 It was just IEnumerable, so I tried to match it.它只是 IEnumerable,所以我试图匹配它。 But when I actually did a .Where() or .Select() on my extension I got an error: "'System Collections.IEnumerable' does not contain a definition for 'Select' and ...".但是当我在我的扩展程序上实际执行 .Where() 或 .Select() 时,我收到一个错误:“'System Collections.IEnumerable' 不包含'Select' 和...的定义”。 So in the end I was able to match the input syntax of .Join(), but not the return behavior.所以最后我能够匹配 .Join() 的输入语法,但不能匹配返回行为。

EDIT: Added "TResult" to the return type for the function.编辑:在函数的返回类型中添加了“TResult”。 Missed that when reading the Microsoft article, and of course it makes sense.在阅读 Microsoft 文章时错过了这一点,当然这是有道理的。 With this fix, it now seems the return behavior is in line with my goals after all.通过此修复程序,现在看来返回行为毕竟符合我的目标。

As you've found, Linq doesn't have an "outer join" construct.正如您所发现的,Linq 没有“外连接”结构。 The closest you can get is a left outer join using the query you stated.您可以获得的最接近的是使用您陈述的查询的左外连接。 To this, you can add any elements of the lastname list that aren't represented in the join:为此,您可以添加姓氏列表中未在联接中表示的任何元素:

outerJoin = outerJoin.Concat(lastNames.Select(l=>new
                            {
                                id = l.ID,
                                firstname = String.Empty,
                                surname = l.Name
                            }).Where(l=>!outerJoin.Any(o=>o.id == l.id)));

I like sehe's answer, but it does not use deferred execution (the input sequences are eagerly enumerated by the calls to ToLookup).我喜欢 sehe 的答案,但它不使用延迟执行(输入序列通过对 ToLookup 的调用急切地枚举)。 So after looking at the .NET sources for LINQ-to-objects , I came up with this:因此,在查看了LINQ-to-objects的 .NET 源代码后,我想出了这个:

public static class LinqExtensions
{
    public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TKey, TResult> resultSelector,
        IEqualityComparer<TKey> comparator = null,
        TLeft defaultLeft = default(TLeft),
        TRight defaultRight = default(TRight))
    {
        if (left == null) throw new ArgumentNullException("left");
        if (right == null) throw new ArgumentNullException("right");
        if (leftKeySelector == null) throw new ArgumentNullException("leftKeySelector");
        if (rightKeySelector == null) throw new ArgumentNullException("rightKeySelector");
        if (resultSelector == null) throw new ArgumentNullException("resultSelector");

        comparator = comparator ?? EqualityComparer<TKey>.Default;
        return FullOuterJoinIterator(left, right, leftKeySelector, rightKeySelector, resultSelector, comparator, defaultLeft, defaultRight);
    }

    internal static IEnumerable<TResult> FullOuterJoinIterator<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TKey, TResult> resultSelector,
        IEqualityComparer<TKey> comparator,
        TLeft defaultLeft,
        TRight defaultRight)
    {
        var leftLookup = left.ToLookup(leftKeySelector, comparator);
        var rightLookup = right.ToLookup(rightKeySelector, comparator);
        var keys = leftLookup.Select(g => g.Key).Union(rightLookup.Select(g => g.Key), comparator);

        foreach (var key in keys)
            foreach (var leftValue in leftLookup[key].DefaultIfEmpty(defaultLeft))
                foreach (var rightValue in rightLookup[key].DefaultIfEmpty(defaultRight))
                    yield return resultSelector(leftValue, rightValue, key);
    }
}

This implementation has the following important properties:此实现具有以下重要属性:

  • Deferred execution, input sequences will not be enumerated before the output sequence is enumerated.延迟执行,在枚举输出序列之前不会枚举输入序列。
  • Only enumerates the input sequences once each.每个输入序列仅枚举一次。
  • Preserves order of input sequences, in the sense that it will yield tuples in the order of the left sequence and then the right (for the keys not present in left sequence).保留输入序列的顺序,从某种意义上说,它将按照左序列然后是右序列的顺序生成元组(对于不存在于左序列中的键)。

These properties are important, because they are what someone new to FullOuterJoin but experienced with LINQ will expect.这些属性很重要,因为它们是 FullOuterJoin 新手但熟悉 LINQ 的人所期望的。

I decided to add this as a separate answer as I am not positive it is tested enough.我决定将此作为单独的答案添加,因为我不确定它是否经过足够的测试。 This is a re-implementation of the FullOuterJoin method using essentially a simplified, customized version of LINQKit Invoke / Expand for Expression so that it should work the Entity Framework.这是FullOuterJoin方法的重新实现,本质上使用简化的自定义版本的LINQKit Invoke / Expand for Expression以便它可以在实体框架中工作。 There's not much explanation as it is pretty much the same as my previous answer.没有太多解释,因为它与我之前的答案几乎相同。

public static class Ext {
    private static Expression<Func<TP, TC, TResult>> CastSMBody<TP, TC, TResult>(LambdaExpression ex, TP unusedP, TC unusedC, TResult unusedRes) => (Expression<Func<TP, TC, TResult>>)ex;

    public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        // (lrg,r) => resultSelector(lrg.left, r)
        var sampleAnonLR = new { left = default(TLeft), rightg = default(IEnumerable<TRight>) };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "lrg");
        var parmC = Expression.Parameter(typeof(TRight), "r");
        var argLeft = Expression.PropertyOrField(parmP, "left");
        var newleftrs = CastSMBody(Expression.Lambda(resultSelector.Apply(argLeft, parmC), parmP, parmC), sampleAnonLR, default(TRight), default(TResult));

        return leftItems.GroupJoin(rightItems, leftKeySelector, rightKeySelector, (left, rightg) => new { left, rightg }).SelectMany(r => r.rightg.DefaultIfEmpty(), newleftrs);
    }

    public static IQueryable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        // (lgr,l) => resultSelector(l, lgr.right)
        var sampleAnonLR = new { leftg = default(IEnumerable<TLeft>), right = default(TRight) };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "lgr");
        var parmC = Expression.Parameter(typeof(TLeft), "l");
        var argRight = Expression.PropertyOrField(parmP, "right");
        var newrightrs = CastSMBody(Expression.Lambda(resultSelector.Apply(parmC, argRight), parmP, parmC), sampleAnonLR, default(TLeft), default(TResult));

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right })
                         .SelectMany(l => l.leftg.DefaultIfEmpty(), newrightrs);
    }

    private static Expression<Func<TParm, TResult>> CastSBody<TParm, TResult>(LambdaExpression ex, TParm unusedP, TResult unusedRes) => (Expression<Func<TParm, TResult>>)ex;

    public static IQueryable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        // newrightrs = lgr => resultSelector(default(TLeft), lgr.right)
        var sampleAnonLgR = new { leftg = (IEnumerable<TLeft>)null, right = default(TRight) };
        var parmLgR = Expression.Parameter(sampleAnonLgR.GetType(), "lgr");
        var argLeft = Expression.Constant(default(TLeft), typeof(TLeft));
        var argRight = Expression.PropertyOrField(parmLgR, "right");
        var newrightrs = CastSBody(Expression.Lambda(resultSelector.Apply(argLeft, argRight), parmLgR), sampleAnonLgR, default(TResult));

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).Where(lgr => !lgr.leftg.Any()).Select(newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector)  where TLeft : class where TRight : class where TResult : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    public static Expression Apply(this LambdaExpression e, params Expression[] args) {
        var b = e.Body;

        foreach (var pa in e.Parameters.Cast<ParameterExpression>().Zip(args, (p, a) => (p, a))) {
            b = b.Replace(pa.p, pa.a);
        }

        return b.PropagateNull();
    }

    public static Expression Replace(this Expression orig, Expression from, Expression to) => new ReplaceVisitor(from, to).Visit(orig);
    public class ReplaceVisitor : System.Linq.Expressions.ExpressionVisitor {
        public readonly Expression from;
        public readonly Expression to;

        public ReplaceVisitor(Expression _from, Expression _to) {
            from = _from;
            to = _to;
        }

        public override Expression Visit(Expression node) => node == from ? to : base.Visit(node);
    }

    public static Expression PropagateNull(this Expression orig) => new NullVisitor().Visit(orig);
    public class NullVisitor : System.Linq.Expressions.ExpressionVisitor {
        public override Expression Visit(Expression node) {
            if (node is MemberExpression nme && nme.Expression is ConstantExpression nce && nce.Value == null)
                return Expression.Constant(null, nce.Type.GetMember(nme.Member.Name).Single().GetMemberType());
            else
                return base.Visit(node);
        }
    }

    public static Type GetMemberType(this MemberInfo member) {
        switch (member) {
            case FieldInfo mfi:
                return mfi.FieldType;
            case PropertyInfo mpi:
                return mpi.PropertyType;
            case EventInfo mei:
                return mei.EventHandlerType;
            default:
                throw new ArgumentException("MemberInfo must be if type FieldInfo, PropertyInfo or EventInfo", nameof(member));
        }
    }
}

Performs a in-memory streaming enumeration over both inputs and invokes the selector for each row.对两个输入执行内存中流枚举并为每一行调用选择器。 If there is no correlation at the current iteration, one of the selector arguments will be null .如果当前迭代中没有相关性,则选择器参数之一将为 null

Example:例子:

   var result = left.FullOuterJoin(
         right, 
         x=>left.Key, 
         x=>right.Key, 
         (l,r) => new { LeftKey = l?.Key, RightKey=r?.Key });
  • Requires an IComparer for the correlation type, uses the Comparer.Default if not provided.需要相关类型的 IComparer,如果未提供,则使用 Comparer.Default。

  • Requires that 'OrderBy' is applied to the input enumerables要求将“OrderBy”应用于输入可枚举项

    /// <summary> /// Performs a full outer join on two <see cref="IEnumerable{T}" />. /// </summary> /// <typeparam name="TLeft"></typeparam> /// <typeparam name="TValue"></typeparam> /// <typeparam name="TRight"></typeparam> /// <typeparam name="TResult"></typeparam> /// <param name="left"></param> /// <param name="right"></param> /// <param name="leftKeySelector"></param> /// <param name="rightKeySelector"></param> /// <param name="selector">Expression defining result type</param> /// <param name="keyComparer">A comparer if there is no default for the type</param> /// <returns></returns> [System.Diagnostics.DebuggerStepThrough] public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TValue, TResult>( this IEnumerable<TLeft> left, IEnumerable<TRight> right, Func<TLeft, TValue> leftKeySelector, Func<TRight, TValue> rightKeySelector, Func<TLeft, TRight, TResult> selector, IComparer<TValue> keyComparer = null) where TLeft: class where TRight: class where TValue : IComparable { keyComparer = keyComparer ?? Comparer<TValue>.Default; using (var enumLeft = left.OrderBy(leftKeySelector).GetEnumerator()) using (var enumRight = right.OrderBy(rightKeySelector).GetEnumerator()) { var hasLeft = enumLeft.MoveNext(); var hasRight = enumRight.MoveNext(); while (hasLeft || hasRight) { var currentLeft = enumLeft.Current; var valueLeft = hasLeft ? leftKeySelector(currentLeft) : default(TValue); var currentRight = enumRight.Current; var valueRight = hasRight ? rightKeySelector(currentRight) : default(TValue); int compare = !hasLeft ? 1 : !hasRight ? -1 : keyComparer.Compare(valueLeft, valueRight); switch (compare) { case 0: // The selector matches. An inner join is achieved yield return selector(currentLeft, currentRight); hasLeft = enumLeft.MoveNext(); hasRight = enumRight.MoveNext(); break; case -1: yield return selector(currentLeft, default(TRight)); hasLeft = enumLeft.MoveNext(); break; case 1: yield return selector(default(TLeft), currentRight); hasRight = enumRight.MoveNext(); break; } } } }

My clean solution for situation that key is unique in both enumerables:对于 key 在两个枚举中都是唯一的情况,我的干净解决方案:

 private static IEnumerable<TResult> FullOuterJoin<Ta, Tb, TKey, TResult>(
            IEnumerable<Ta> a, IEnumerable<Tb> b,
            Func<Ta, TKey> key_a, Func<Tb, TKey> key_b,
            Func<Ta, Tb, TResult> selector)
        {
            var alookup = a.ToLookup(key_a);
            var blookup = b.ToLookup(key_b);
            var keys = new HashSet<TKey>(alookup.Select(p => p.Key));
            keys.UnionWith(blookup.Select(p => p.Key));
            return keys.Select(key => selector(alookup[key].FirstOrDefault(), blookup[key].FirstOrDefault()));
        }

so所以

    var ax = new[] {
        new { id = 1, first_name = "ali" },
        new { id = 2, first_name = "mohammad" } };
    var bx = new[] {
        new { id = 1, last_name = "rezaei" },
        new { id = 3, last_name = "kazemi" } };

    var list = FullOuterJoin(ax, bx, a => a.id, b => b.id, (a, b) => "f: " + a?.first_name + " l: " + b?.last_name).ToArray();

outputs:输出:

f: ali l: rezaei
f: mohammad l:
f:  l: kazemi

I've written this extensions class for an app perhaps 6 years ago, and have been using it ever since in many solutions without issues.我大概在 6 年前为一个应用程序编写了这个扩展类,并且从那时起一直在许多解决方案中使用它,没有出现问题。 Hope it helps.希望能帮助到你。

edit: I noticed some might not know how to use an extension class.编辑:我注意到有些人可能不知道如何使用扩展类。

To use this extension class, just reference its namespace in your class by adding the following line using joinext;要使用这个扩展类,只需通过使用 joinext 添加以下行来引用它在类中的命名空间;

^ this should allow you to to see the intellisense of extension functions on any IEnumerable object collection you happen to use. ^ 这应该允许您在您碰巧使用的任何 IEnumerable 对象集合上查看扩展函数的智能感知。

Hope this helps.希望这可以帮助。 Let me know if it's still not clear, and I'll hopefully write a sample example on how to use it.如果它仍然不清楚,请告诉我,我希望写一个示例来说明如何使用它。

Now here is the class:现在这里是类:

namespace joinext
{    
public static class JoinExtensions
    {
        public static IEnumerable<TResult> FullOuterJoin<TOuter, TInner, TKey, TResult>(
            this IEnumerable<TOuter> outer,
            IEnumerable<TInner> inner,
            Func<TOuter, TKey> outerKeySelector,
            Func<TInner, TKey> innerKeySelector,
            Func<TOuter, TInner, TResult> resultSelector)
            where TInner : class
            where TOuter : class
        {
            var innerLookup = inner.ToLookup(innerKeySelector);
            var outerLookup = outer.ToLookup(outerKeySelector);

            var innerJoinItems = inner
                .Where(innerItem => !outerLookup.Contains(innerKeySelector(innerItem)))
                .Select(innerItem => resultSelector(null, innerItem));

            return outer
                .SelectMany(outerItem =>
                {
                    var innerItems = innerLookup[outerKeySelector(outerItem)];

                    return innerItems.Any() ? innerItems : new TInner[] { null };
                }, resultSelector)
                .Concat(innerJoinItems);
        }


        public static IEnumerable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
            this IEnumerable<TOuter> outer,
            IEnumerable<TInner> inner,
            Func<TOuter, TKey> outerKeySelector,
            Func<TInner, TKey> innerKeySelector,
            Func<TOuter, TInner, TResult> resultSelector)
        {
            return outer.GroupJoin(
                inner,
                outerKeySelector,
                innerKeySelector,
                (o, i) =>
                    new { o = o, i = i.DefaultIfEmpty() })
                    .SelectMany(m => m.i.Select(inn =>
                        resultSelector(m.o, inn)
                        ));

        }



        public static IEnumerable<TResult> RightJoin<TOuter, TInner, TKey, TResult>(
            this IEnumerable<TOuter> outer,
            IEnumerable<TInner> inner,
            Func<TOuter, TKey> outerKeySelector,
            Func<TInner, TKey> innerKeySelector,
            Func<TOuter, TInner, TResult> resultSelector)
        {
            return inner.GroupJoin(
                outer,
                innerKeySelector,
                outerKeySelector,
                (i, o) =>
                    new { i = i, o = o.DefaultIfEmpty() })
                    .SelectMany(m => m.o.Select(outt =>
                        resultSelector(outt, m.i)
                        ));

        }

    }
}

Full outer join for two or more tables: First extract the column that you want to join on.两个或多个表的全外连接:首先提取要连接的列。

var DatesA = from A in db.T1 select A.Date; 
var DatesB = from B in db.T2 select B.Date; 
var DatesC = from C in db.T3 select C.Date;            

var Dates = DatesA.Union(DatesB).Union(DatesC); 

Then use left outer join between the extracted column and main tables.然后在提取的列和主表之间使用左外连接。

var Full_Outer_Join =

(from A in Dates
join B in db.T1
on A equals B.Date into AB 

from ab in AB.DefaultIfEmpty()
join C in db.T2
on A equals C.Date into ABC 

from abc in ABC.DefaultIfEmpty()
join D in db.T3
on A equals D.Date into ABCD

from abcd in ABCD.DefaultIfEmpty() 
select new { A, ab, abc, abcd })
.AsEnumerable();

I think that LINQ join clause isn't the correct solution to this problem, because of join clause purpose isn't to accumulate data in such way as required for this task solution.我认为 LINQ join 子句不是这个问题的正确解决方案,因为 join 子句的目的不是以这种任务解决方案所需的方式积累数据。 The code to merge created separate collections becomes too complicated, maybe it is OK for learning purposes, but not for real applications.合并创建的单独集合的代码变得太复杂了,也许用于学习目的是可以的,但不适用于实际应用。 One of the ways how to solve this problem is in the code below:如何解决这个问题的方法之一是在下面的代码中:

class Program
{
    static void Main(string[] args)
    {
        List<FirstName> firstNames = new List<FirstName>();
        firstNames.Add(new FirstName { ID = 1, Name = "John" });
        firstNames.Add(new FirstName { ID = 2, Name = "Sue" });

        List<LastName> lastNames = new List<LastName>();
        lastNames.Add(new LastName { ID = 1, Name = "Doe" });
        lastNames.Add(new LastName { ID = 3, Name = "Smith" });

        HashSet<int> ids = new HashSet<int>();
        foreach (var name in firstNames)
        {
            ids.Add(name.ID);
        }
        foreach (var name in lastNames)
        {
            ids.Add(name.ID);
        }
        List<FullName> fullNames = new List<FullName>();
        foreach (int id in ids)
        {
            FullName fullName = new FullName();
            fullName.ID = id;
            FirstName firstName = firstNames.Find(f => f.ID == id);
            fullName.FirstName = firstName != null ? firstName.Name : string.Empty;
            LastName lastName = lastNames.Find(l => l.ID == id);
            fullName.LastName = lastName != null ? lastName.Name : string.Empty;
            fullNames.Add(fullName);
        }
    }
}
public class FirstName
{
    public int ID;

    public string Name;
}

public class LastName
{
    public int ID;

    public string Name;
}
class FullName
{
    public int ID;

    public string FirstName;

    public string LastName;
}

If real collections are large for HashSet formation instead foreach loops can be used the code below:如果 HashSet 形成的实际集合很大,则可以使用以下代码代替 foreach 循环:

List<int> firstIds = firstNames.Select(f => f.ID).ToList();
List<int> LastIds = lastNames.Select(l => l.ID).ToList();
HashSet<int> ids = new HashSet<int>(firstIds.Union(LastIds));//Only unique IDs will be included in HashSet

Thank You everybody for the interesting posts!谢谢大家的有趣帖子!

I modified the code because in my case I needed我修改了代码,因为在我的情况下我需要

  • a personalized join predicate个性化的连接谓词
  • a personalized union distinct comparer个性化的联合独特比较器

For the ones interested this is my modified code (in VB, sorry)对于感兴趣的人,这是我修改后的代码(在 VB 中,抱歉)

    Module MyExtensions
        <Extension()>
        Friend Function FullOuterJoin(Of TA, TB, TResult)(ByVal a As IEnumerable(Of TA), ByVal b As IEnumerable(Of TB), ByVal joinPredicate As Func(Of TA, TB, Boolean), ByVal projection As Func(Of TA, TB, TResult), ByVal comparer As IEqualityComparer(Of TResult)) As IEnumerable(Of TResult)
            Dim joinL =
                From xa In a
                From xb In b.Where(Function(x) joinPredicate(xa, x)).DefaultIfEmpty()
                Select projection(xa, xb)
            Dim joinR =
                From xb In b
                From xa In a.Where(Function(x) joinPredicate(x, xb)).DefaultIfEmpty()
                Select projection(xa, xb)
            Return joinL.Union(joinR, comparer)
        End Function
    End Module

    Dim fullOuterJoin = lefts.FullOuterJoin(
        rights,
        Function(left, right) left.Code = right.Code And (left.Amount [...] Or left.Description.Contains [...]),
        Function(left, right) New CompareResult(left, right),
        New MyEqualityComparer
    )

    Public Class MyEqualityComparer
        Implements IEqualityComparer(Of CompareResult)

        Private Function GetMsg(obj As CompareResult) As String
            Dim msg As String = ""
            msg &= obj.Code & "_"
            [...]
            Return msg
        End Function

        Public Overloads Function Equals(x As CompareResult, y As CompareResult) As Boolean Implements IEqualityComparer(Of CompareResult).Equals
            Return Me.GetMsg(x) = Me.GetMsg(y)
        End Function

        Public Overloads Function GetHashCode(obj As CompareResult) As Integer Implements IEqualityComparer(Of CompareResult).GetHashCode
            Return Me.GetMsg(obj).GetHashCode
        End Function
    End Class

Yet another full outer join另一个完整的外连接

As was not that happy with the simplicity and the readability of the other propositions, I ended up with this :由于对其他命题的简单性和可读性不太满意,我最终得到了这个:

It does not have the pretension to be fast ( about 800 ms to join 1000 * 1000 on a 2020m CPU : 2.4ghz / 2cores).它没有快速的自负(在 2020m CPU 上加入 1000 * 1000 大约需要 800 毫秒:2.4ghz / 2cores)。 To me, it is just a compact and casual full outer join.对我来说,它只是一个紧凑而随意的全外连接。

It works the same as a SQL FULL OUTER JOIN (duplicates conservation)它的工作原理与 SQL FULL OUTER JOIN 相同(重复保护)

Cheers ;-)干杯;-)

using System;
using System.Collections.Generic;
using System.Linq;
namespace NS
{
public static class DataReunion
{
    public static List<Tuple<T1, T2>> FullJoin<T1, T2, TKey>(List<T1> List1, Func<T1, TKey> KeyFunc1, List<T2> List2, Func<T2, TKey> KeyFunc2)
    {
        List<Tuple<T1, T2>> result = new List<Tuple<T1, T2>>();

        Tuple<TKey, T1>[] identifiedList1 = List1.Select(_ => Tuple.Create(KeyFunc1(_), _)).OrderBy(_ => _.Item1).ToArray();
        Tuple<TKey, T2>[] identifiedList2 = List2.Select(_ => Tuple.Create(KeyFunc2(_), _)).OrderBy(_ => _.Item1).ToArray();

        identifiedList1.Where(_ => !identifiedList2.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
            result.Add(Tuple.Create<T1, T2>(_.Item2, default(T2)));
        });

        result.AddRange(
            identifiedList1.Join(identifiedList2, left => left.Item1, right => right.Item1, (left, right) => Tuple.Create<T1, T2>(left.Item2, right.Item2)).ToList()
        );

        identifiedList2.Where(_ => !identifiedList1.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
            result.Add(Tuple.Create<T1, T2>(default(T1), _.Item2));
        });

        return result;
    }
}
}

The idea is to这个想法是

  1. Build Ids based on provided key function builders基于提供的关键功能构建器的构建 ID
  2. Process left only items处理仅剩的项目
  3. Process inner join进程内连接
  4. Process right only items仅处理正确的项目

Here is a succinct test that goes with it :这是一个与之配套的简洁测试:

Place a break point at the end to manually verify that it behaves as expected在末尾放置一个断点以手动验证它的行为是否符合预期

using System;
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using NS;

namespace Tests
{
[TestClass]
public class DataReunionTest
{
    [TestMethod]
    public void Test()
    {
        List<Tuple<Int32, Int32, String>> A = new List<Tuple<Int32, Int32, String>>();
        List<Tuple<Int32, Int32, String>> B = new List<Tuple<Int32, Int32, String>>();

        Random rnd = new Random();

        /* Comment the testing block you do not want to run
        /* Solution to test a wide range of keys*/

        for (int i = 0; i < 500; i += 1)
        {
            A.Add(Tuple.Create(rnd.Next(1, 101), rnd.Next(1, 101), "A"));
            B.Add(Tuple.Create(rnd.Next(1, 101), rnd.Next(1, 101), "B"));
        }

        /* Solution for essential testing*/

        A.Add(Tuple.Create(1, 2, "B11"));
        A.Add(Tuple.Create(1, 2, "B12"));
        A.Add(Tuple.Create(1, 3, "C11"));
        A.Add(Tuple.Create(1, 3, "C12"));
        A.Add(Tuple.Create(1, 3, "C13"));
        A.Add(Tuple.Create(1, 4, "D1"));

        B.Add(Tuple.Create(1, 1, "A21"));
        B.Add(Tuple.Create(1, 1, "A22"));
        B.Add(Tuple.Create(1, 1, "A23"));
        B.Add(Tuple.Create(1, 2, "B21"));
        B.Add(Tuple.Create(1, 2, "B22"));
        B.Add(Tuple.Create(1, 2, "B23"));
        B.Add(Tuple.Create(1, 3, "C2"));
        B.Add(Tuple.Create(1, 5, "E2"));

        Func<Tuple<Int32, Int32, String>, Tuple<Int32, Int32>> key = (_) => Tuple.Create(_.Item1, _.Item2);

        var watch = System.Diagnostics.Stopwatch.StartNew();
        var res = DataReunion.FullJoin(A, key, B, key);
        watch.Stop();
        var elapsedMs = watch.ElapsedMilliseconds;
        String aser = JToken.FromObject(res).ToString(Formatting.Indented);
        Console.Write(elapsedMs);
    }
}

} }

I really hate these linq expressions, this is why SQL exists:我真的很讨厌这些 linq 表达式,这就是 SQL 存在的原因:

select isnull(fn.id, ln.id) as id, fn.firstname, ln.lastname
   from firstnames fn
   full join lastnames ln on ln.id=fn.id

Create this as sql view in database and import it as entity.在数据库中将其创建为 sql 视图并将其作为实体导入。

Of course, (distinct) union of left and right joins will make it too, but it is stupid.当然,左右连接的(不同的)联合也会使它成为,但它是愚蠢的。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM