简体   繁体   English

在实体框架中选择不同的行

[英]Selecting distinct rows in Entity Framework

I'm trying to query an SQL Server view with Entity Framework and only return rows that are distinct based on multiple columns.我正在尝试使用实体框架查询 SQL 服务器视图,并且仅返回基于多列不同的行。

I've tried a solution based on this answer ( GroupBy and then Select(g => g.FirstOrDefault()) , but I'm still getting duplicate rows.我已经尝试了基于这个答案的解决方案( GroupBy然后Select(g => g.FirstOrDefault()) ,但我仍然得到重复的行。

Table structure (this is a fairly complex view in the real database, but the final output is similar in structure to this example):表结构(这是真实数据库中比较复杂的视图,但最终的output在结构上与本例类似):

CREATE TABLE Example (
    ID_A BIGINT,
    ID_B BIGINT,
    ID_C BIGINT,

    Type_A NVARCHAR(50),
    Type_B NVARCHAR(50),

    ID_Z BIGINT,

    Foo NVARCHAR(200),
    Bar NVARCHAR(200)
)

Example data:示例数据:

INSERT INTO Example (ID_A, ID_B, ID_C, Type_A, Type_B,  ID_Z,  Foo, Bar)
VALUES (1, 1, 1, 'TypeA1', 'TypeB1',  1,  'foo1', 'bar1'), -- This row and the next one represent the same main record (1) joined to different "Z" records (1 and 2)
       (1, 1, 1, 'TypeA1', 'TypeB1',  2,  'foo1', 'bar1'),
       (2, 1, 2, 'TypeA2', 'TypeA2',  1,  'foo2', 'bar2'), -- This row and the next two represent the same main record (2) joined to different "Z" records (1, 2 and 3)
       (2, 1, 2, 'TypeA2', 'TypeA2',  2,  'foo2', 'bar2'),
       (2, 1, 2, 'TypeA2', 'TypeA2',  3,  'foo2', 'bar2')

Entity class:实体 class:

public class ExampleEntity
{
    [Key]
    public long ID_A { get; set; }
    public long ID_B { get; set; }
    public long ID_C { get; set; }

    public string Type_A { get; set; }
    public string Type_B { get; set; }

    public long? ID_Z { get; set; }

    public string Foo { get; set; }
    public string Bar { get; set; }

Entity configuration:实体配置:

public class ExampleEntityConfiguration : EntityTypeConfiguration<ExampleEntity>
{
    public ExampleEntityConfiguration()
    {
        // Properties
        this.Property(t => t.ID_A)
            .HasDatabaseGeneratedOption(DatabaseGeneratedOption.None);

        // Table & Column Mappings
        this.ToTable("Example");
        this.Property(t => t.ID_A).HasColumnName("ID_A");
        this.Property(t => t.ID_B).HasColumnName("ID_B");
        this.Property(t => t.ID_C).HasColumnName("ID_C");

        this.Property(t => t.Type_A).HasColumnName("Type_A");
        this.Property(t => t.Type_B).HasColumnName("Type_B");

        this.Property(t => t.ID_Z).HasColumnName("ID_Z");

        this.Property(t => t.Foo).HasColumnName("Foo");
        this.Property(t => t.Bar).HasColumnName("Bar");
    }
}

ID_A , ID_B , ID_C , Type_A and Type_B identify the "main" entity and ID_Z identifies the joined "Z" entity. ID_AID_BID_CType_AType_B标识“主”实体, ID_Z标识连接的“Z”实体。 Foo and Bar are non-unique data columns that need to be included in the final results. FooBar是需要包含在最终结果中的非唯一数据列。

For each combination of main ID/Type values, there can be multiple ID_Z values.对于主 ID/Type 值的每个组合,可以有多个ID_Z值。 I need to filter by the ID_Z values and then return the distinct main entity values (based on the ID/Type values).我需要按ID_Z值进行过滤,然后返回不同的主要实体值(基于 ID/Type 值)。

I've tried a query like the following:我尝试了如下查询:

// The `ID_Z` values to filter on
var zIDs = new List<long> { 1, 2 };

var result = context.Set<ExampleEntity>()
       .Where(e => zIDs.Contains(e.ID_Z))
       .GroupBy(e => new { e.ID_A, e.ID_B, e.ID_C, e.Type_A, e.Type_B })
       .Select(g => g.FirstOrDefault())
       .ToList();

Which results in this SQL:这导致了这个 SQL:

SELECT 
    [Limit1].[ID_A] AS [ID_A], 
    [Limit1].[ID_B] AS [ID_B], 
    [Limit1].[ID_C] AS [ID_C], 
    [Limit1].[Type_A] AS [Type_A], 
    [Limit1].[Type_B] AS [Type_B], 
    [Limit1].[ID_Z] AS [ID_Z], 
    [Limit1].[Foo] AS [Foo], 
    [Limit1].[Bar] AS [Bar]
FROM   (
    SELECT [Extent1].[ID_A] AS [ID_A], [Extent1].[ID_B] AS [ID_B], [Extent1].[ID_C] AS [ID_C], [Extent1].[Type_A] AS [Type_A], [Extent1].[Type_B] AS [Type_B]
    FROM [dbo].[Example] AS [Extent1] WITH (NOLOCK)
    WHERE 
        ([Extent1].[ID_Z] IN (cast(1 as bigint), cast(2 as bigint))) AND ([Extent1].[ID_Z] IS NOT NULL)
) AS [Filter1]
OUTER APPLY  (
    SELECT TOP (1) 
        [Extent2].[ID_A] AS [ID_A], 
        [Extent2].[ID_B] AS [ID_B], 
        [Extent2].[ID_C] AS [ID_C], 
        [Extent2].[Type_A] AS [Type_A], 
        [Extent2].[Type_B] AS [Type_B], 
        [Extent2].[ID_Z] AS [ID_Z], 
        [Extent2].[Foo] AS [Foo], 
        [Extent2].[Bar] AS [Bar]
     FROM [dbo].[Example] AS [Extent2] WITH (NOLOCK)
     WHERE 
        ([Extent2].[ID_Z] IN (cast(1 as bigint), cast(2 as bigint))) AND ([Extent2].[ID_Z] IS NOT NULL) AND 
        ([Filter1].[ID_A] = [Extent2].[ID_A]) AND 
        ([Filter1].[ID_B] = [Extent2].[ID_B]) AND 
        ([Filter1].[ID_C] = [Extent2].[ID_C]) AND 
        (([Filter1].[Type_A] = [Extent2].[Type_A]) OR (([Filter1].[Type_A] IS NULL) AND ([Extent2].[Type_A] IS NULL))) AND 
        (([Filter1].[Type_B] = [Extent2].[Type_B]) OR (([Filter1].[Type_B] IS NULL) AND ([Extent2].[Type_B] IS NULL))) 
) AS [Limit1]

But this seems to return all of the rows matching the Z_ID filter (resulting in duplicate "main" values) instead of only returning the first row for each set of "main" ID/Type values.但这似乎返回了与Z_ID过滤器匹配的所有行(导致重复的“主”值),而不是只返回每组“主”ID/Type 值的第一行。

If I materialise ( ToList ) the query directly after GroupBy , I seem to get the correct groupings;如果我在GroupBy之后直接实现( ToList )查询,我似乎得到了正确的分组; but I'd like to run this all in the DB and avoid using LINQ to Objects queries.但我想在数据库中运行这一切,并避免将 LINQ 用于对象查询。

How can I create this query?如何创建此查询?

(Because apparently this was never done, I'm copying Ivan Stoev's comment below as "the answer") (因为显然这从未做过,所以我在下面复制Ivan Stoev 的评论作为“答案”)

As I was expecting, the problem is caused by the PK definition of the entity.正如我所料,问题是由实体的 PK 定义引起的。 By marking ID_A as PK essentially you are telling EF that this field is unique, hence it will assume any combination of fields including that field as unique, and will remove the normal GroupBy / Distinct operator.通过将 ID_A 标记为 PK,本质上您是在告诉 EF 该字段是唯一的,因此它将假定包括该字段在内的任何字段组合都是唯一的,并将删除正常的 GroupBy / Distinct 运算符。 You really need to define a real unique key as PK for that entity.您确实需要为该实体定义一个真正的唯一键作为 PK。 Eventually including all fields, but that doesn't work with nullable fields.最终包括所有字段,但这不适用于可为空的字段。 Better extend the database view to include ROW_NUMBER or NEWID based calculated column and map it as PK.更好地扩展数据库视图以包含基于 ROW_NUMBER 或 NEWID 的计算列,并将 map 作为 PK。 – Ivan Stoev Oct 10 at 15:58 – Ivan Stoev 10 月 10 日 15:58

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM