Severe performance problems when performing a simple query using Entity Framework

I've got a fairly generic CRUD webapp, which generates pages dynamically according to the contents of several database tables. I'm using Entity Framework 4.0 to pull this data out of the DB, but I'm running into severe performance problems. I've narrowed it down to a problem that is self-contained enough to detail below.

I have a table containing a list of Page Forms (~200). Each form has one or more Fields (~4000 total), and each field may have some Parameters (~16000 total).

I've attached a screenshot of my model below:

[Image: entity model]

The associated entity objects are as follows:

public class Form
{
    public int FormID { get; set; }
    public string FormName { get; set; }

    public IList<FormField> FormFields { get; set; }

}

public class FormField
{
    public int FieldID { get; set; }
    public string FieldName { get; set; }
    public int FormID { get; set; }

    public IList<FormFieldParameter> FormFieldParameters { get; set; }
    public Form ParentForm { get; set; }

}

public class FormFieldParameter
{
    public int FieldParamID { get; set; }
    public string Value { get; set; }
    public int? FieldID { get; set; }

    public FormField ParentField { get; set; }
}

The following code pulls out all data for the Form which has an ID of '1'.

EntityConnection myConnection = new EntityConnection("name=myModel");

// Open the connection only if it is not open already.
if (myConnection.State != ConnectionState.Open) {
    myConnection.Open();
}
ObjectContext context = new ObjectContext("name=myModel");
context.ContextOptions.LazyLoadingEnabled = false;

ObjectQuery<PageForm> myObjectSet = context.CreateObjectSet<PageForm>()
                                           .Include("FormField.FormFieldParameter");

//Edit: I missed this part out, sorry. In hindsight, this was exactly what was
//causing the issue.
IEnumerable<PageForm> myObjectSetEnumerable = myObjectSet.AsEnumerable();
IQueryable<PageForm> myFilteredObjectSet = myObjectSetEnumerable.Where(c => c.FormID == 1)
                                                                .AsQueryable();

List<PageForm> myReturnValue = myFilteredObjectSet.ToList();

Now, while this does work, it performs really poorly. The query takes over a second to run, all of which is spent in the myFilteredObjectSet.ToList() call. I ran a profiler on my database to see what was causing the delay, and found that the following query was being generated:

SELECT 
[Project1].[FormID] AS [FormID], 
[Project1].[FormName] AS [FormName], 
[Project1].[C2] AS [C1], 
[Project1].[FormID1] AS [FormID1], 
[Project1].[FieldID] AS [FieldID], 
[Project1].[FieldName] AS [FieldName], 
[Project1].[C1] AS [C2], 
[Project1].[FieldParamID] AS [FieldParamID], 
[Project1].[Value] AS [Value], 
[Project1].[FieldID1] AS [FieldID1]
FROM ( SELECT 
    [Extent1].[FormID] AS [FormID], 
    [Extent1].[FormName] AS [FormName], 
    [Join1].[FieldID] AS [FieldID], 
    [Join1].[FieldName] AS [FieldName], 
    [Join1].[FormID] AS [FormID1], 
    [Join1].[FieldParamID] AS [FieldParamID], 
    [Join1].[Value] AS [Value], 
    [Join1].[FieldID1] AS [FieldID1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[FieldParamID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [dbo].[PageForm] AS [Extent1]
    LEFT OUTER JOIN  (SELECT [Extent2].[FieldID] AS [FieldID], [Extent2].[FieldName] AS [FieldName], [Extent2].[FormID] AS [FormID], [Extent3].[FieldParamID] AS [FieldParamID], [Extent3].[Value] AS [Value], [Extent3].[FieldID] AS [FieldID1]
        FROM  [dbo].[FormField] AS [Extent2]
        LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] ) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]
)  AS [Project1]
ORDER BY [Project1].[FormID] ASC, [Project1].[C2] ASC, [Project1].[FieldID] ASC, [Project1].[C1] ASC

The duration shown in SQL Profiler confirms that this query is what takes so long to run. The interesting thing about the query is that there is no filtering on it at all: it returns the entire tree! I can't understand why it returns everything, as the filter myObjectSet.Where(c => c.FormID == 1) is pretty explicit. The returned object itself contains only a single entry, as I would expect.

I'm having this problem across my entire data access layer, and its performance is appalling. I have no idea why the generated query doesn't contain the filter, and no idea how to tell it to do so. Does anybody know the answer?

TL;DR: Remove the AsEnumerable call and replace it with an AsQueryable call; that should resolve most of the performance issues (apart from the database execution itself being slow, which is fixed by adding indexes on the columns you filter or join on).
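As a rough sketch of that change (not the asker's actual data layer), the filter can be applied while the sequence is still typed as IQueryable. The names FormQueries and LoadForm are hypothetical, and PageForm is reduced here to a stand-in class so the snippet compiles on its own; in the question's code the forms argument is assumed to be the ObjectQuery returned by CreateObjectSet<PageForm>().Include(...), which implements IQueryable<PageForm>.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's entity; only the key matters here.
public class PageForm
{
    public int FormID { get; set; }
    public string FormName { get; set; }
}

public static class FormQueries
{
    // 'forms' is assumed to be the Entity Framework query from the question,
    // passed in as IQueryable<PageForm> with no AsEnumerable call in between.
    public static List<PageForm> LoadForm(IQueryable<PageForm> forms, int formId)
    {
        // Queryable.Where adds the predicate to the query itself, so the
        // provider sees it before anything is executed.
        IQueryable<PageForm> filtered = forms.Where(f => f.FormID == formId);

        // The query runs here, already carrying the filter.
        return filtered.ToList();
    }
}
```

With an Entity Framework source, the expectation is that the generated SQL then carries a WHERE clause on FormID instead of returning every row; it is worth confirming that in the profiler.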

Explanation of what is actually happening...

As soon as you call AsEnumerable, you are outside of Entity Framework and in the world of LINQ-to-objects. That means the query as it stands at that point, with no filter, is what gets executed against the database when the sequence is enumerated. It doesn't matter that you call AsQueryable again; that merely means you are creating a query against an in-memory structure.
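The difference can be seen without a database by inspecting the expression tree of the resulting IQueryable. This is only an illustration using LINQ's built-in in-memory provider (what AsQueryable gives you on a plain list), not Entity Framework itself; QueryShapes and its method names are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryShapes
{
    // Filter applied while the sequence is still IQueryable:
    // the Where call is recorded in the expression tree.
    public static IQueryable<int> FilterAsQueryable(IQueryable<int> source)
    {
        return source.Where(x => x == 1);
    }

    // Filter applied after AsEnumerable: Enumerable.Where works on the
    // in-memory sequence, and the trailing AsQueryable merely wraps the
    // resulting enumerable in a new queryable.
    public static IQueryable<int> FilterAfterAsEnumerable(IQueryable<int> source)
    {
        return source.AsEnumerable().Where(x => x == 1).AsQueryable();
    }

    // Reports the kind of node at the root of the query's expression tree.
    public static ExpressionType RootNode(IQueryable<int> query)
    {
        return query.Expression.NodeType;
    }
}
```

Given `new List<int> { 1, 2, 3 }.AsQueryable()` as the source, RootNode returns ExpressionType.Call for the first query (a call to Where that a provider could translate) and ExpressionType.Constant for the second, because the filter has already been hidden inside an opaque in-memory object. A provider like Entity Framework can only translate what is in the expression tree.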

The effective execution is this.

  1. Create an object query, including all FormFieldParameters linked to the form.
  2. Transform the current IQueryable instance into an enumerable.
  3. Add a predicate against the enumerable instance which will only return items whose FormID value is one.
  4. Call ToList, which copies all values from the source enumerable to a list.

Up until step 4, nothing has actually queried the database. When you call ToList, it executes the query from step 1 (as you saw in the profiler). This query is likely expensive and takes a while because of the amount of data it returns and/or missing indexes that might improve its performance.

Once that query is done and materialized, its result is wrapped in an enumerator.

Now every object is iterated and checked to see if it matches the predicate that was added in step 3. If it does match, it is returned to whoever is iterating over it (in this case, the ToList function).

Once the value has been returned, it is added to the list that is being built.

Finally, you get a list back from the ToList method, and it has exactly what you asked for, but all of the filtering was done in memory rather than in the database.
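The sequence above can be mimicked with plain LINQ-to-objects. In this sketch a lazy iterator stands in for the unfiltered database query and counts how many "rows" it hands out; InMemoryFilterDemo, AllFormIds and Run are hypothetical names, and the integers are only placeholders for form rows.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class InMemoryFilterDemo
{
    // Number of "rows" the fake data source has handed out so far.
    public static int RowsRead;

    // Stand-in for the unfiltered query: lazily yields every form ID.
    public static IEnumerable<int> AllFormIds(int total)
    {
        for (int id = 1; id <= total; id++)
        {
            RowsRead++;
            yield return id;
        }
    }

    // Builds the filtered sequence, records how many rows were read before
    // ToList was called, then materializes the result.
    public static List<int> Run(int total, out int readBeforeToList)
    {
        RowsRead = 0;

        // Steps 1 to 3: nothing is read yet, the pipeline is only described.
        IEnumerable<int> filtered = AllFormIds(total).Where(id => id == 1);
        readBeforeToList = RowsRead;

        // Step 4: every row is pulled and tested in memory.
        return filtered.ToList();
    }
}
```

Calling `Run(200, out int before)` leaves `before` at 0, returns a list with a single element, and leaves RowsRead at 200: one match comes back, but all 200 rows had to be read to find it.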
