
Avoiding record duplication in LINQ

I have a situation where records are being duplicated and I don't know how to deal with it. Here is the LINQ statement:

theData = (from urls in this.ObjectContext.Activities.AsExpandable()
            .Where(predicateA)
            .Where(r => (r.StartDate >= beginDate && r.StartDate <= endDate) ||
                  (r.EndDate >= beginDate && r.EndDate <= endDate) ||
                  (r.StartDate <= beginDate && r.EndDate >= endDate))
                join idGroups in this.ObjectContext.IdentityGroups 
                    on urls.IdentityID equals idGroups.IdentityID
                join groupSup in this.ObjectContext.GroupSupervisors
                    .Where(r => r.SupervisorID == loggedInID) 
                    on idGroups.GroupID equals groupSup.GroupID
                join programs in progs 
                    on urls.ProcessName.ToUpper() equals programs.ProcessName.ToUpper() 
            into jt
            from jt1 in jt.DefaultIfEmpty()
                .Where(r => r == null || r.Ignore == false)
            group urls by new 
                        { urls.ProcessName, 
                          urls.ContextID, 
                          jt1.CustomCategory, 
                          jt1.Name, 
                          groupSup.SupervisorID 
                        } 
            into groupedTable
            select new ActivityInfoSummary_DTO
            {
               recId = Guid.NewGuid(),
               Context = groupedTable.Key.ProcessName,
               ContextId = groupedTable.Key.ContextID,
               SupervisorId = groupedTable.Key.SupervisorID,
               FocusCount = groupedTable.Sum(r => r.FocusCount),
               // many more fields...
            }).ToList();

The dilemma is this: urls.IdentityID is the ID of the person who created the record.

The person creating the record can belong to more than one group.

Each group has a single supervisor.

Each supervisor can be the supervisor of multiple groups.

The LINQ statement is trying to filter the records down to those created by a person who is a member of a group the supervisor manages (the supervisor's ID is the loggedInID value in the groupSup filter).

If a person is a member of multiple groups the supervisor manages, the record is reported once per group and the numbers are inflated.

That is one of my test cases :( My question is: how do I restructure this so that, if the supervisor manages multiple groups, everyone reporting to them is counted only once? In other words, a person belonging to two or more groups managed by the same supervisor should have their data reported only once.

Thanks in advance!!
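
To make the inflation concrete, here is a minimal in-memory sketch of the same join shape. The tuples are hypothetical stand-ins for the Activities, IdentityGroups and GroupSupervisors entities, and the sample values are invented for illustration; only the join structure is taken from the query above.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: person 1 belongs to two groups (100 and 200),
// both managed by supervisor 7. Person 2 belongs to group 100 only.
var activities = new[]
{
    (IdentityID: 1, ProcessName: "notepad", FocusCount: 10),
    (IdentityID: 2, ProcessName: "excel", FocusCount: 5),
};
var identityGroups = new[]
{
    (IdentityID: 1, GroupID: 100),
    (IdentityID: 1, GroupID: 200),
    (IdentityID: 2, GroupID: 100),
};
var groupSupervisors = new[]
{
    (GroupID: 100, SupervisorID: 7),
    (GroupID: 200, SupervisorID: 7),
};
int loggedInID = 7;

// Same join shape as the question: each activity row is repeated once
// per matching group, so person 1's activity appears twice.
int joinedTotal = (from a in activities
                   join ig in identityGroups on a.IdentityID equals ig.IdentityID
                   join gs in groupSupervisors.Where(g => g.SupervisorID == loggedInID)
                       on ig.GroupID equals gs.GroupID
                   select a.FocusCount).Sum();

// The total the report should show: each activity counted once.
int expectedTotal = activities.Sum(a => a.FocusCount);

Console.WriteLine($"joined: {joinedTotal}, expected: {expectedTotal}");
// prints "joined: 25, expected: 15"
```

The extra 10 comes from person 1's single activity row matching two IdentityGroups rows, which is the duplication described above.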

You may be able to do this by just introducing another ID into the object model and filtering on it in the first part of the statement:

from urls in this.ObjectContext.Activities.AsExpandable().Where(predicateA)
      .Where(x => x.GroupID == groupSelectionID)
      .Where(r => (r.StartDate >= beginDate && r.StartDate <= endDate) ||
            (r.EndDate >= beginDate && r.EndDate <= endDate) ||
            (r.StartDate <= beginDate && r.EndDate >= endDate))

In that solution, x is the activity record, which would carry some kind of group context. That means the objects in the urls collection would need a GroupID property.

This is what happens when you work too closely with something for too long. I finally figured it out: I broke the query down into two steps. The first step gets a list of distinct identities, and the second uses that list to filter the query by identity.

So:

            var theIDsToInclude = (from id in this.ObjectContext.Identities
                                join idGroups in this.ObjectContext.IdentityGroups
                                    on id.IdentityID equals idGroups.IdentityID
                                join groupSup in this.ObjectContext.GroupSupervisors.Where(r => r.SupervisorID == loggedInID)
                                    on idGroups.GroupID equals groupSup.GroupID
                                select id.IdentityID).Distinct().ToList();


            theData = (from urls in this.ObjectContext.Activities.AsExpandable().Where(predicateA)
                                  .Where(r => (r.StartDate >= beginDate && r.StartDate <= endDate) ||
                                        (r.EndDate >= beginDate && r.EndDate <= endDate) ||
                                        (r.StartDate <= beginDate && r.EndDate >= endDate))
                                        .Where(r=> theIDsToInclude.Contains(r.IdentityID))
                       //join idGroups in this.ObjectContext.IdentityGroups on urls.IdentityID equals idGroups.IdentityID
                       //join groupSup in this.ObjectContext.GroupSupervisors.Where(r => r.SupervisorID == loggedInID) on idGroups.GroupID equals groupSup.GroupID
                       join programs in progs on urls.ProcessName.ToUpper() equals programs.ProcessName.ToUpper() into jt
                       from jt1 in jt.DefaultIfEmpty().Where(r => r == null || r.Ignore == false)
                       group urls by new { urls.ProcessName, urls.ContextID, jt1.CustomCategory, jt1.Name } into groupedTable
                       select new ActivityInfoSummary_DTO
                       {
                           recId = Guid.NewGuid(),
                           Context = groupedTable.Key.ProcessName,
                           ContextId = groupedTable.Key.ContextID,
                           FocusCount = groupedTable.Sum(r => r.FocusCount),
                           // many more fields...
                       }).ToList();
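
As a possible variation on the two-step approach, the membership check can also be written as a single filter with Any, so that each activity is tested for "has at least one group managed by this supervisor" instead of being joined to every such group. The sketch below runs in memory on hypothetical tuples with invented sample values. Whether a given Entity Framework version translates the nested Any calls into an efficient EXISTS subquery is an assumption worth checking against the generated SQL.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: person 1 is in two groups managed by supervisor 7,
// person 2 is in one of them, person 3 reports to a different supervisor.
var activities = new[]
{
    (IdentityID: 1, ProcessName: "notepad", FocusCount: 10),
    (IdentityID: 2, ProcessName: "excel", FocusCount: 5),
    (IdentityID: 3, ProcessName: "chrome", FocusCount: 99),
};
var identityGroups = new[]
{
    (IdentityID: 1, GroupID: 100),
    (IdentityID: 1, GroupID: 200),
    (IdentityID: 2, GroupID: 100),
    (IdentityID: 3, GroupID: 300),
};
var groupSupervisors = new[]
{
    (GroupID: 100, SupervisorID: 7),
    (GroupID: 200, SupervisorID: 7),
    (GroupID: 300, SupervisorID: 8),
};
int loggedInID = 7;

// Keep an activity if its creator belongs to at least one group managed by
// the logged-in supervisor. Any() answers yes or no, so an activity is never
// repeated, however many of the supervisor's groups the creator is in.
var visibleActivities = activities.Where(a =>
    identityGroups.Any(ig =>
        ig.IdentityID == a.IdentityID &&
        groupSupervisors.Any(gs =>
            gs.GroupID == ig.GroupID && gs.SupervisorID == loggedInID)));

int semiJoinTotal = visibleActivities.Sum(a => a.FocusCount);
int visibleCount = visibleActivities.Count();

Console.WriteLine($"rows: {visibleCount}, total: {semiJoinTotal}");
// prints "rows: 2, total: 15"
```

Compared with materializing theIDsToInclude first, this keeps everything in one query and avoids sending a long ID list back to the database. The two-step version above is easier to debug, since the ID list can be inspected on its own.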
