Why does Entity Framework dbSet.Include take so long to return?

Question

I'm doing a simple "get" on a table, and the "include" part of the routine is taking much longer than I would expect.

I've narrowed the performance issue down to this snippet of code:

private List<Task> GetFilteredTasksWithOptionalIncludes(EntityRepository<Task> repo, ITaskRequest model, TaskIncludesModel includes, string accountID)
{
    var includedEntities = new List<Expression<Func<Task, object>>>();

    includedEntities.Add(t => t.Document.Transaction.Account);

    includedEntities.Add(p => p.Signature);

    if (includes != null)
    {
        if (includes.IncludeWorkflowActions)
        {
            includedEntities.Add(p => p.Actions);
        }

        if (includes.IncludeFileAttachments)
        {
            includedEntities.Add(p => p.Attachments);
        }
    }

    IQueryable<Task> tasks = repo.GetAllIncluding(includedEntities.ToArray());  //RETURNS SLOW

    return tasks.ToList();  //RETURNS FAST
}

All my code runs very fast until it hits the repo.GetAllIncluding method shown below:

public IQueryable<TEntity> GetAllIncluding(params Expression<Func<TEntity, object>>[] includeProperties)
{
    foreach (var includeProperty in includeProperties)
    {
        dbSet.Include(includeProperty).Load();
    }

    return dbSet;
}

As I step through the code, it takes up to 6 or 7 seconds for the GetAllIncluding() line to return, but it takes less than a second for tasks.ToList() to return (which is when the actual SQL query gets run, so this surprises me).

When I comment out the foreach loop that loads the includes, the entire call returns in under a second. What causes the includes to take so long? And is there some better way to do this?

Here is the SQL Profiler trace around the call, if it helps. Everything above the red line is from the GetAllIncluding call. Everything below the red line is the actual query for the data. Is there a more efficient way to do this? It doesn't seem like a fairly simple call should take 10 seconds.

Answer 1

When you call .Load() in that loop, you are actually going to the database and bringing the data into the context.

Each iteration of the loop therefore runs its own query, so the more includes you pass in, the more times you query the database before the final ToList() runs.

I would suggest removing the .Load() call; you can still keep your including function. A basic Including function for a generic repository would be something like this:

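// Requires the System.Data.Entity, System.Linq and System.Linq.Expressions namespaces.
public IQueryable<TEntity> Including(params Expression<Func<TEntity, object>>[] _includeProperties)
{
    IQueryable<TEntity> query = context.Set<TEntity>();
    // Chain each Include onto the query; nothing is sent to the database here.
    return _includeProperties.Aggregate(query, (current, includeProperty) => current.Include(includeProperty));
}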

Once you have called this and have your IQueryable, simply call .ToList() on it so that the data is pulled from SQL only once.

Have a read of http://msdn.microsoft.com/en-gb/data/jj592911.aspx to see what the Load method is actually doing.
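To make the difference concrete, here is a minimal sketch that uses plain LINQ to Objects as a stand-in for Entity Framework. The names DeferredExecutionDemo, ReadTable and RoundTrips are invented for illustration, and the counter only simulates database round trips; the point is that materializing inside a loop pays the cost on every iteration, while composing first and materializing once pays it a single time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the simulated "table" is actually read.
    public static int RoundTrips;

    // Stand-in for a table: the body runs only when the sequence is enumerated.
    public static IEnumerable<int> ReadTable()
    {
        RoundTrips++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Mirrors the original GetAllIncluding: materializes once per include.
    public static int EagerPerStep(int steps)
    {
        RoundTrips = 0;
        for (int i = 0; i < steps; i++)
        {
            ReadTable().ToList();   // plays the role of dbSet.Include(x).Load()
        }
        ReadTable().ToList();       // plays the role of the final tasks.ToList()
        return RoundTrips;
    }

    // Mirrors the Aggregate version: compose first, materialize once.
    public static int ComposeThenMaterialize(int steps)
    {
        RoundTrips = 0;
        IEnumerable<int> query = ReadTable();
        for (int i = 0; i < steps; i++)
        {
            query = query.Select(x => x);   // composing only; nothing is read yet
        }
        query.ToList();                     // the single read happens here
        return RoundTrips;
    }
}
```

With four steps, EagerPerStep reads the simulated table five times and ComposeThenMaterialize reads it once, which matches the pattern of many queries above the red line in the profiler trace versus one query below it.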

Edit based on your comments:

You could implement the function I posted above and use it much as you do now, explicitly calling Load() on the tasks queryable afterwards if you want.

private List<Task> GetFilteredTasksWithOptionalIncludes(EntityRepository<Task> repo, ITaskRequest model, TaskIncludesModel includes, string accountID)
{
    var includedEntities = new List<Expression<Func<Task, object>>>();

    includedEntities.Add(t => t.Document.Transaction.Account);

    includedEntities.Add(p => p.Signature);

    if (includes != null)
    {
        if (includes.IncludeWorkflowActions)
        {
            includedEntities.Add(p => p.Actions);
        }

        if (includes.IncludeFileAttachments)
        {
            includedEntities.Add(p => p.Attachments);
        }
    }

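    // Builds the query with all includes; no SQL is executed yet.
    IQueryable<Task> tasks = repo.Including(includedEntities.ToArray());
    // Optional: Load() executes the query and fills the context; ToList() below
    // executes it again, so drop this line if a single round trip is the goal.
    tasks.Load();
    return tasks.ToList();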
}

Answer 2

Alternatively, as a follow-up to Thewads's answer, just write a good old stored procedure (sproc) that returns all of this in multiple result sets at once. You can then optimise the database schema with appropriate indexes and have it run as fast as possible.

That probably isn't going to be popular, but once you start talking about database performance it is the quicker way, and it is easier to work with because you have all the SQL tooling for performance tuning. For example, turn on query execution plans. You might find something insane is happening.

Another question would be: do you really need all this data? Are there no filter conditions you can apply before you load it all? (This assumes you fix the multiple-loading issue.)
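As a rough sketch of filtering before loading, the helper below applies a predicate to the IQueryable before it is materialized. FilterBeforeLoad and ToFilteredList are hypothetical names, not part of the question's repository. Against an Entity Framework query the predicate would normally be translated into the SQL WHERE clause, so only matching rows come back; the sketch itself uses only System.Linq and works on any IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class FilterBeforeLoad
{
    // Applies the filter while the query is still an IQueryable, then
    // materializes it once. Taking an Expression (not a Func) keeps the
    // predicate available to the query provider for translation.
    public static List<T> ToFilteredList<T>(IQueryable<T> query, Expression<Func<T, bool>> filter)
    {
        return query.Where(filter).ToList();
    }

    // Hypothetical usage with the repository from Answer 1, assuming Account
    // exposes an ID property (not shown in the question):
    //   var tasks = FilterBeforeLoad.ToFilteredList(
    //       repo.Including(includedEntities.ToArray()),
    //       t => t.Document.Transaction.Account.ID == accountID);
}
```

The accountID parameter is already passed into GetFilteredTasksWithOptionalIncludes but never used there, so it is a natural candidate for such a filter.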

2 answers

Answer 1
1 2014-05-01 14:13:55

Answer 2
-1 2014-05-01 14:17:16