Entity Framework related objects using generics

Question

There are many questions about using Include() and Load() to fetch related table data with LINQ to Entities. I have a different spin on this question.

My situation:

I have a table that holds many time records for each user in the system. I use a repository pattern and generics for development, so all of my entities implement an interface that standardizes their method calls. I have lazy loading turned off, and I load all data myself. The code in the repository for loading all records from a table, together with its related tables, looks like this:

public class Repository<T> : IRepository<T> where T : class
{
    protected readonly ApplicationDbContext Context;

    public Repository(IConnectionHelper connection)
    {
        Context = connection.Context;
    }
    public virtual DbSet<T> ObjectSet
    {
        get { return Context.Set<T>(); }
    }  
    public List<T> GetAll(String[] include, Expression<Func<T, bool>> predicate)
    {
        DbQuery<T> outQuery = null;
        foreach (String s in include)
        {
            outQuery = ObjectSet.Include(s);
            outQuery.Load();
        }
        return outQuery.Where(predicate).ToList();
    }
}

The call to the method is like this:

string[] includes = { "User.UserProfile", "CampaignTimeClocks.CampaignRole.Campaign", "Site", "Type" };
DateTime uTcCurrent = GetUtc();
DateTime MinClockinDate = uTcCurrent.AddHours(-10);
List<TimeClock> tcPending = _timeClock.GetAll(includes, x => (x.PendingReview || x.ClockInDate < MinClockinDate && x.ClockOutDate == null) && (x.Site.Id == currentUser.SiteId));

When this method runs and loads the first related table, User.Profile, it loads all the timeclock records and relates them to all of the users. This takes upwards of a minute, which is far too long: the final result is only 185 records, but the initial load of the query is processing 27,000 * 560 users, or about 15 million records, and this is only going to get much worse as time goes on.

The question is how to do this without the load overhead. I know I can chain includes, but the number of includes changes depending on what is being called and what I am doing with the data, so I cannot simply hard-code a chain of includes.

I have also tried:

List<TimeClock> testLst =  _timeClock.GetAll(x => x.PendingReview || 
     (x.ClockInDate < MinClockinDate && x.ClockOutDate == null))
          .Select(x => new TimeClock{Id = x.Id,
                                     ClockInDate = x.ClockInDate, 
                                     ClockOutDate = x.ClockOutDate,
                                     TotalClockTime = x.TotalClockTime,
                                     Notes = x.Notes, 
                                     PendingReview = x.PendingReview, 
                                     Type = x.Type,
                                     User = x.User, 
                                     CampaignTimeClocks = x.CampaignTimeClocks,
                                     TimeClockAdjustments = x.TimeClockAdjustments,
                                     Site = x.User.Site}).ToList();

This gives me the User.Profile information, but the Site and Type properties are null.

So I am a bit lost as to how to load the data I need here.

All help is greatly appreciated.

Answer 1

Can you get the initial list first:

List<TimeClock> testLst =  _timeClock.Where(x => x.PendingReview || (x.ClockInDate < MinClockinDate && x.ClockOutDate == null)).ToList();

and then call a modified GetAll() that takes a T as an argument?

Answer 2

Each include you add ends up as a join executed in the database. Suppose the left table has a large record size, say 1024 bytes, and each row has many detail rows, say 1000, with a detail record size of only 100 bytes. The left table's data is then repeated 1000 times in the result set. The database puts all of that on the wire, and EF has to filter out the duplicates to create your single left instance.
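To make the numbers in that example concrete, here is a rough back-of-the-envelope calculation. It is only a sketch: it counts raw row bytes and ignores protocol overhead, column-level differences, and anything the database driver might optimize. The class and method names are made up for illustration.

```csharp
using System;

public static class JoinPayload
{
    // Bytes returned when parent and children come back as one joined result set:
    // every child row carries a full copy of the parent's columns.
    public static long JoinedBytes(long parentSize, long childSize, long childCount)
    {
        return childCount * (parentSize + childSize);
    }

    // Bytes returned when the parent and its children are fetched by two separate queries:
    // the parent row is sent once, followed by the child rows on their own.
    public static long SplitBytes(long parentSize, long childSize, long childCount)
    {
        return parentSize + childCount * childSize;
    }

    public static void Main()
    {
        long joined = JoinedBytes(1024, 100, 1000);
        long split = SplitBytes(1024, 100, 1000);
        Console.WriteLine("joined: " + joined + " bytes, split: " + split + " bytes");
    }
}
```

With the figures above (1024-byte parent, 1000 details of 100 bytes each), the joined result is 1,124,000 bytes against 101,024 bytes for two queries, roughly an elevenfold difference for a single parent row.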

It can be better not to use include and to do an explicit load instead, basically executing 2 queries on the same context.

I have an example like this. It is different from yours, but I hope you get the idea. It can be up to 10 times faster than relying on include. (A database can only handle a limited number of joins efficiently, by the way.)

var adressen = adresRepository
    .Query(r => r.RelatieId == relatieId)
    .Include(i => i.AdresType)
    .Select().ToList();

var adresids = (from a in adressen select a.AdresId).ToList();
IRepositoryAsync<Comm> commRepository = unitOfWork.RepositoryAsync<Comm>();

var comms = commRepository
    .Query(c => adresids.Contains(c.AdresId))
    .Include(i => i.CommType)
    .Select();

For commType and adresType I use include because those are 1-to-1 relationships. By avoiding too many joins, my multiple queries are faster than a single query using include. I deliberately do not include the Comms in the first query just to avoid the second query; the point is that 2 queries are faster in this case than a single one.

Note that my code is built on my own repositories, so it will not work for you as written, but you can get the idea behind it.

Answer 3

The way I found to do this more efficiently, without using the Load() statement, is to change the DbQuery to an IQueryable, chain the includes, return the executed query results, and remove DbQuery.Load() altogether. This changed the execution time of the query from seconds to milliseconds.

    public List<T> GetAll(String[] include)
    {
        IQueryable<T> outQuery = ObjectSet;
        foreach (String s in include)
        {
            outQuery = outQuery.Include(s);
        }
        return outQuery.ToList();
    }
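The version above drops the predicate parameter that the question's call site passes. A minimal sketch of putting the filter back might look like the following. It assumes EF6 (the System.Data.Entity namespace, which supplies the string-based Include extension on IQueryable<T>), and the constructor taking a DbContext directly is a simplification for illustration rather than the question's IConnectionHelper.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;          // EF6: Include(string) extension on IQueryable<T>
using System.Linq;
using System.Linq.Expressions;

public class Repository<T> where T : class
{
    protected readonly DbContext Context;

    // Assumption: the context is injected directly instead of via IConnectionHelper.
    public Repository(DbContext context)
    {
        Context = context;
    }

    public List<T> GetAll(string[] include, Expression<Func<T, bool>> predicate)
    {
        // Start from the set so an empty include array still yields a valid query.
        IQueryable<T> query = Context.Set<T>();

        // Chain each include onto the same query instead of restarting from the set.
        foreach (string path in include)
        {
            query = query.Include(path);
        }

        // Nothing has been sent to the database yet; the filter is applied
        // before ToList() executes the query, so only matching rows are loaded.
        return query.Where(predicate).ToList();
    }
}
```

Because the predicate is applied before ToList(), the filter is part of the query sent to the database rather than being run over a fully loaded table, and the existing call `_timeClock.GetAll(includes, x => ...)` from the question would not need to change.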

3 answers

Answer 1 0 2015-02-12 15:39:40
Answer 2 0 2015-02-12 19:29:51
Answer 3 0 2015-02-16 19:15:43
