How do you perform a left outer join using LINQ extension methods?

Assuming I have a left outer join as such:

from f in Foo
join b in Bar on f.Foo_Id equals b.Foo_Id into g
from result in g.DefaultIfEmpty()
select new { Foo = f, Bar = result }

How would I express the same task using extension methods? E.g.:

Foo.GroupJoin(Bar, f => f.Foo_Id, b => b.Foo_Id, (f,b) => ???)
    .Select(???)

For a (left outer) join of a table Bar with a table Foo on Foo.Foo_Id = Bar.Foo_Id in lambda notation:

var qry = Foo.GroupJoin(
          Bar, 
          foo => foo.Foo_Id,
          bar => bar.Foo_Id,
          (x,y) => new { Foo = x, Bars = y })
       .SelectMany(
           x => x.Bars.DefaultIfEmpty(),
           (x,y) => new { Foo=x.Foo, Bar=y});
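To see this run against in-memory data, here is a minimal sketch. The Foo and Bar classes, their Name and Label properties, and the sample rows are assumptions made up for illustration; only Foo_Id comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample types; only Foo_Id appears in the question.
public class Foo { public int Foo_Id { get; set; } public string Name { get; set; } = ""; }
public class Bar { public int Foo_Id { get; set; } public string Label { get; set; } = ""; }

public static class LeftJoinDemo
{
    public static List<string> Run()
    {
        var foos = new List<Foo>
        {
            new Foo { Foo_Id = 1, Name = "first" },
            new Foo { Foo_Id = 2, Name = "second" }   // has no matching Bar
        };
        var bars = new List<Bar>
        {
            new Bar { Foo_Id = 1, Label = "a" },
            new Bar { Foo_Id = 1, Label = "b" }
        };

        var qry = foos.GroupJoin(
                bars,
                foo => foo.Foo_Id,
                bar => bar.Foo_Id,
                (x, y) => new { Foo = x, Bars = y })
            .SelectMany(
                x => x.Bars.DefaultIfEmpty(),
                (x, y) => new { Foo = x.Foo, Bar = y });

        // Bar is null for unmatched rows, so guard before dereferencing it.
        return qry
            .Select(r => r.Foo.Name + " -> " + (r.Bar == null ? "(none)" : r.Bar.Label))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```

The Foo with no matching Bar still appears once in the output, paired with a null Bar, which is what makes this a left outer join rather than an inner join.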

Since this seems to be the de facto SO question for left outer joins using the method (extension) syntax, I thought I would add an alternative to the currently selected answer that (in my experience at least) has been more commonly what I'm after:

// Option 1: Expecting either 0 or 1 matches from the "Right"
// table (Bars in this case):
var qry = Foos.GroupJoin(
          Bars,
          foo => foo.Foo_Id,
          bar => bar.Foo_Id,
          (f,bs) => new { Foo = f, Bar = bs.SingleOrDefault() });

// Option 2: Expecting either 0 or more matches from the "Right" table
// (courtesy of currently selected answer):
var qry = Foos.GroupJoin(
                  Bars, 
                  foo => foo.Foo_Id,
                  bar => bar.Foo_Id,
                  (f,bs) => new { Foo = f, Bars = bs })
              .SelectMany(
                  fooBars => fooBars.Bars.DefaultIfEmpty(),
                  (x,y) => new { Foo = x.Foo, Bar = y });

To display the difference using a simple data set (assuming we're joining on the values themselves):

List<int> tableA = new List<int> { 1, 2, 3 };
List<int?> tableB = new List<int?> { 3, 4, 5 };

// Result using both Option 1 and 2. Option 1 would be a better choice
// if we didn't expect multiple matches in tableB.
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    }

List<int> tableA = new List<int> { 1, 2, 3 };
List<int?> tableB = new List<int?> { 3, 3, 4 };

// Result using Option 1 would be that an exception gets thrown on
// SingleOrDefault(), but if we use FirstOrDefault() instead to illustrate:
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    } // Misleading: there were multiple matches.
                    // Which 3 should be selected? Picking the first is arbitrary.

// Result using Option 2:
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    }
{ A = 3, B = 3    }    

Option 2 is true to the typical left outer join definition, but as I mentioned earlier is often unnecessarily complex depending on the data set.
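The comparison above can be reproduced with a small runnable sketch. The class and method names (OptionComparison, Option1, Option2, Show) are made up for illustration, and the left key is cast to int? so that both key selectors return the same type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OptionComparison
{
    // Formats one result row in the same shape as the listings above.
    static string Show(int a, int? b) =>
        "A = " + a + ", B = " + (b.HasValue ? b.Value.ToString() : "null");

    // Option 1: expects 0 or 1 matches; SingleOrDefault() throws on more.
    public static List<string> Option1(List<int> tableA, List<int?> tableB) =>
        tableA.GroupJoin(tableB, a => (int?)a, b => b,
                (a, bs) => new { A = a, B = bs.SingleOrDefault() })
            .Select(r => Show(r.A, r.B))
            .ToList();

    // Option 2: expects 0 or more matches; one output row per match.
    public static List<string> Option2(List<int> tableA, List<int?> tableB) =>
        tableA.GroupJoin(tableB, a => (int?)a, b => b,
                (a, bs) => new { A = a, Bs = bs })
            .SelectMany(x => x.Bs.DefaultIfEmpty(), (x, b) => new { x.A, B = b })
            .Select(r => Show(r.A, r.B))
            .ToList();

    public static void Main()
    {
        var tableA = new List<int> { 1, 2, 3 };
        var duplicates = new List<int?> { 3, 3, 4 };

        foreach (var line in Option2(tableA, duplicates)) Console.WriteLine(line);

        try
        {
            Option1(tableA, duplicates);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Option 1 threw: more than one match for A = 3");
        }
    }
}
```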

The GroupJoin method is not needed to join two data sets.

Inner join:

var qry = Foos.SelectMany
            (
                foo => Bars.Where(bar => foo.Foo_Id == bar.Foo_Id),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

For a left join, just add DefaultIfEmpty():

var qry = Foos.SelectMany
            (
                foo => Bars.Where(bar => foo.Foo_Id == bar.Foo_Id).DefaultIfEmpty(),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

EF and LINQ to SQL translate this correctly to SQL. For LINQ to Objects it is better to join using GroupJoin, as it internally uses a Lookup. But if you are querying a database, then skipping GroupJoin is, AFAIK, just as performant.

Personally, I find this more readable than GroupJoin().SelectMany().
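As a rough in-memory illustration of the SelectMany form, here is a sketch with made-up Foo and Bar classes and sample rows (the Label property is an assumption). Against in-memory collections the Where clause rescans the right side for every left element, which is the cost the Lookup remark above refers to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample types for illustration.
public class Foo { public int Foo_Id { get; set; } }
public class Bar { public int Foo_Id { get; set; } public string Label { get; set; } = ""; }

public static class SelectManyLeftJoin
{
    public static List<string> Run()
    {
        var foos = new[] { new Foo { Foo_Id = 1 }, new Foo { Foo_Id = 2 } };
        var bars = new[] { new Bar { Foo_Id = 1, Label = "a" } };

        // Left join without GroupJoin: filter the right side per left element,
        // then pad an empty match set with a single null via DefaultIfEmpty().
        var qry = foos.SelectMany(
            foo => bars.Where(bar => foo.Foo_Id == bar.Foo_Id).DefaultIfEmpty(),
            (foo, bar) => new { Foo = foo, Bar = bar });

        return qry
            .Select(r => r.Foo.Foo_Id + ":" + (r.Bar == null ? "null" : r.Bar.Label))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```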

You can create an extension method like:

public static IEnumerable<TResult> LeftOuterJoin<TSource, TInner, TKey, TResult>(
    this IEnumerable<TSource> source,
    IEnumerable<TInner> other,
    Func<TSource, TKey> func,
    Func<TInner, TKey> innerkey,
    Func<TSource, TInner, TResult> res)
{
    return from f in source
           join b in other on func.Invoke(f) equals innerkey.Invoke(b) into g
           from result in g.DefaultIfEmpty()
           select res.Invoke(f, result);
}
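A possible way to call it, as a sketch: the method is repeated here inside a static class (extension methods must live in one), and the containing class names and sample arrays are made up for illustration. The join key is simply the first character of each string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinExtensions
{
    // Same logic as the method above, wrapped in a static class.
    public static IEnumerable<TResult> LeftOuterJoin<TSource, TInner, TKey, TResult>(
        this IEnumerable<TSource> source,
        IEnumerable<TInner> other,
        Func<TSource, TKey> func,
        Func<TInner, TKey> innerkey,
        Func<TSource, TInner, TResult> res)
    {
        return from f in source
               join b in other on func(f) equals innerkey(b) into g
               from result in g.DefaultIfEmpty()
               select res(f, result);
    }
}

public static class LeftOuterJoinUsage
{
    public static List<string> Run()
    {
        var names = new[] { "ann", "bob" };        // hypothetical left side
        var tags = new[] { "a-admin", "a-dev" };   // hypothetical right side

        return names.LeftOuterJoin(
                tags,
                n => n[0],   // join on the first character
                t => t[0],
                // t is null when a name has no matching tag.
                (n, t) => n + " -> " + (t ?? "(none)"))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```

Because the result selector receives the default value of TInner for unmatched rows, it has to handle null (or the default of a value type) itself.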

Improving on Ocelot20's answer: if the table you're left outer joining with can have multiple matching rows but you want only 0 or 1 of them, you need to order the joined table:

var qry = Foos.GroupJoin(
      Bars.OrderByDescending(b => b.Id),
      foo => foo.Foo_Id,
      bar => bar.Foo_Id,
      (f, bs) => new { Foo = f, Bar = bs.FirstOrDefault() });

Otherwise, which row you get from the join is effectively random (more specifically, whichever row the database happens to find first).
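A possible variation, which is not part of the original answer, is to sort inside the result selector so that the choice is made explicitly per group. The sketch below runs against in-memory data; the classes and sample rows are made up, and how a given database provider translates either form is not something this example demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample types for illustration.
public class Foo { public int Foo_Id { get; set; } }
public class Bar { public int Id { get; set; } public int Foo_Id { get; set; } }

public static class LatestBarPerFoo
{
    public static List<string> Run()
    {
        var foos = new[] { new Foo { Foo_Id = 1 }, new Foo { Foo_Id = 2 } };
        var bars = new[]
        {
            new Bar { Id = 10, Foo_Id = 1 },
            new Bar { Id = 11, Foo_Id = 1 }
        };

        var qry = foos.GroupJoin(
            bars,
            foo => foo.Foo_Id,
            bar => bar.Foo_Id,
            // Pick the Bar with the highest Id in each group, or null if none.
            (f, bs) => new
            {
                Foo = f,
                Bar = bs.OrderByDescending(b => b.Id).FirstOrDefault()
            });

        return qry
            .Select(r => r.Foo.Foo_Id + ":" + (r.Bar == null ? "null" : r.Bar.Id.ToString()))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```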

While the accepted answer works and is good for LINQ to Objects, it bugged me that the SQL query isn't just a straight LEFT OUTER JOIN.

The following code relies on the LinqKit project, which allows you to pass expressions and invoke them within your query.

static IQueryable<TResult> LeftOuterJoin<TSource,TInner, TKey, TResult>(
     this IQueryable<TSource> source, 
     IQueryable<TInner> inner, 
     Expression<Func<TSource,TKey>> sourceKey, 
     Expression<Func<TInner,TKey>> innerKey, 
     Expression<Func<TSource, TInner, TResult>> result
    ) {
    return from a in source.AsExpandable()
            join b in inner on sourceKey.Invoke(a) equals innerKey.Invoke(b) into c
            from d in c.DefaultIfEmpty()
            select result.Invoke(a,d);
}

It can be used as follows:

Table1.LeftOuterJoin(Table2, x => x.Key1, x => x.Key2, (x,y) => new { x,y});

Turning Marc Gravell's answer into an extension method, I came up with the following:

internal static IEnumerable<Tuple<TLeft, TRight>> LeftJoin<TLeft, TRight, TKey>(
    this IEnumerable<TLeft> left,
    IEnumerable<TRight> right,
    Func<TLeft, TKey> selectKeyLeft,
    Func<TRight, TKey> selectKeyRight,
    TRight defaultRight = default(TRight),
    IEqualityComparer<TKey> cmp = null)
{
    return left.GroupJoin(
            right,
            selectKeyLeft,
            selectKeyRight,
            (x, y) => new Tuple<TLeft, IEnumerable<TRight>>(x, y),
            cmp ?? EqualityComparer<TKey>.Default)
        .SelectMany(
            x => x.Item2.DefaultIfEmpty(defaultRight),
            (x, y) => new Tuple<TLeft, TRight>(x.Item1, y));
}
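A short usage sketch for this method follows. The method is repeated (made public, in a static class with a made-up name) so the example is self-contained, and the sample arrays are invented; the right side is joined on string length purely to keep the data small.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleJoinExtensions
{
    // Same logic as the LeftJoin method above.
    public static IEnumerable<Tuple<TLeft, TRight>> LeftJoin<TLeft, TRight, TKey>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> selectKeyLeft,
        Func<TRight, TKey> selectKeyRight,
        TRight defaultRight = default(TRight),
        IEqualityComparer<TKey> cmp = null)
    {
        return left.GroupJoin(
                right,
                selectKeyLeft,
                selectKeyRight,
                (x, y) => new Tuple<TLeft, IEnumerable<TRight>>(x, y),
                cmp ?? EqualityComparer<TKey>.Default)
            .SelectMany(
                x => x.Item2.DefaultIfEmpty(defaultRight),
                (x, y) => new Tuple<TLeft, TRight>(x.Item1, y));
    }
}

public static class TupleJoinUsage
{
    public static List<string> Run()
    {
        var left = new[] { 1, 2, 3 };        // hypothetical left side
        var right = new[] { "x", "yyy" };    // hypothetical right side, keyed by length

        // "(none)" is used instead of null for left values without a match.
        return left.LeftJoin(right, l => l, r => r.Length, "(none)")
            .Select(t => t.Item1 + ":" + t.Item2)
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```

Passing defaultRight lets the caller avoid null checks downstream, at the cost of no longer being able to tell a real match from the placeholder unless the placeholder is chosen carefully.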

There is an easy solution to this.

Just use .HasValue in your Select (this assumes Foo_Id is nullable and s.Foo is a navigation property):

.Select(s => new 
{
    FooName = s.Foo_Id.HasValue ? s.Foo.Name : "Default Value"
})

Very easy, no need for GroupJoin or anything else.

Marc Gravell's answer, turned into an extension method that supports the IQueryable<T> interface, is given in this answer. With added support for C# 8.0 nullable reference types (NRT), it reads as follows:

#nullable enable
using LinqKit;
using LinqKit.Core;
using System.Linq.Expressions;

...

/// <summary>
/// Left join queryable. Linq to SQL compatible. IMPORTANT: any Includes must be put on the source collections before calling this method.
/// </summary>
public static IQueryable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
    this IQueryable<TOuter> outer,
    IQueryable<TInner> inner,
    Expression<Func<TOuter, TKey>> outerKeySelector,
    Expression<Func<TInner, TKey>> innerKeySelector,
    Expression<Func<TOuter, TInner?, TResult>> resultSelector)
{
    return outer
        .AsExpandable()
        .GroupJoin(
            inner,
            outerKeySelector,
            innerKeySelector,
            (outerItem, innerItems) => new { outerItem, innerItems })
        .SelectMany(
            joinResult => joinResult.innerItems.DefaultIfEmpty(),
            (joinResult, innerItem) =>
                resultSelector.Invoke(joinResult.outerItem, innerItem));
}

I have this question bookmarked and need to reference it every year or so. Each time I revisit this, I find I have forgotten how it works. Here's a more detailed explanation of what's happening.

GroupJoin is like a mix of GroupBy and Join. It groups the inner collection by the join key, then pairs each element of the outer collection with the group of inner elements that share its key. Suppose we have customers and orders. If you GroupJoin customers to orders on the respective IDs, the result selector receives each Customer together with an IEnumerable<Order> holding that customer's orders. GroupJoin is useful because every outer object is represented even if the inner collection contains no matching objects: for customers with no orders, the IEnumerable<Order> is simply empty. Once we have { Customer, IEnumerable<Order> }, we can use it as-is, filter out results that have no orders, or flatten with SelectMany to get results like a traditional LINQ Join.

Here's a full example if anyone wants to step through with the debugger and see how this works:

using System;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        //Create some customers
        var customers = new Customer[]
        {
            new Customer(1, "Alice"),
            new Customer(2, "Bob"),
            new Customer(3, "Carol")
        };
        
        //Create some orders for Alice and Bob, but none for Carol
        var orders = new Order[]
        {
            new Order(1, 1),
            new Order(2, 1),
            new Order(3, 1),
            new Order(4, 2),
            new Order(5, 2)
        };

        //Group join customers to orders.
        //Result is an enumerable of (Customer customer, IEnumerable<Order> order) tuples.
        //Every customer will be present. 
        //If a customer has no orders, the IEnumerable<Order> will be empty.
        var groupJoined = customers.GroupJoin(orders,
                              c => c.ID,
                              o => o.CustomerID,
                              (customer, order) => (customer, order));

        //Display results. Prints:
        //    Customer: Alice (CustomerID=1), Orders: 3
        //    Customer: Bob (CustomerID=2), Orders: 2
        //    Customer: Carol (CustomerID=3), Orders: 0
        foreach(var result in groupJoined)
        {
            Console.WriteLine($"Customer: {result.customer.Name} (CustomerID={result.customer.ID}), Orders: {result.order.Count()}");
        }
        
        //Flatten the results to look more like a LINQ join
        //Produces an enumerable of { Customer, Order }
        //All customers represented, order is null if customer has no orders
        var flattened = groupJoined.SelectMany(z => z.order.DefaultIfEmpty().Select(y => new { z.customer, y }));

        //Get only the customers that have no matching orders.
        //Roughly equivalent to: 
        //SELECT * 
        //FROM A 
        //LEFT JOIN B 
        //ON A.ID = B.ID 
        //WHERE B.ID IS NULL;
        //DefaultIfEmpty() always yields at least one element, so test the group itself.
        var noMatch = groupJoined.Where(z => !z.order.Any());
    }
}

class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }

    public Customer(int iD, string name)
    {
        ID = iD;
        Name = name;
    }
}

class Order
{
    static Random Random { get; set; } = new Random();

    public int ID { get; set; }
    public int CustomerID { get; set; }
    public decimal Amount { get; set; }

    public Order(int iD, int customerID)
    {
        ID = iD;
        CustomerID = customerID;
        Amount = (decimal)Random.Next(1000, 10000) / 100;
    }
}
