
Most efficient (& fastest) way to query a list

I'm trying to work out the most performant way to query a list. I know that there are a ton of examples out there and this has come up loads before, but I'm really new to this and I'm struggling with how to apply some of the concepts to my situation.

private static void KeepMatchesBasedOnRestrictions(ref List<Entity> matches,
        List<Entity> preFilteredShifts, List<Entity> locationalInformations)
    {
        if (matches.Count == 0) return;

        matches.RemoveAll(
            match => GeographyHasRestriction(match, preFilteredShifts, locationalInformations));
    }

private static bool GeographyHasRestriction(Entity match, List<Entity> preFilteredShifts, List<Entity> locationalInformations)
    {                  
        EntityReference fw = match.GetAttributeValue<EntityReference>("crm_fw");

        Entity shift = preFilteredShifts.Single<Entity>( 
                a => match.GetAttributeValue<EntityReference>("crm_shift").Id == a.Id
            );
        EntityReference trust = shift.GetAttributeValue<EntityReference>("crm_trust");
        EntityReference location = shift.GetAttributeValue<EntityReference>("crm_location");
        EntityReference ward = shift.GetAttributeValue<EntityReference>("crm_ward");

        Dictionary<Guid, Entity> locInfoRecs = locationalInformations.ToDictionary(p => p.Id);

        var locationalInformationQuery = from loc in locationalInformations
                                         where (
                                            (
                                                loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                                && !loc.Contains("crm_trust")
                                                && !loc.Contains("crm_location")
                                                && !loc.Contains("crm_ward")
                                            )
                                            ||
                                            (
                                                loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                                && loc.GetAttributeValue<EntityReference>("crm_trust").Id == trust.Id
                                                && !loc.Contains("crm_location")
                                                && !loc.Contains("crm_ward")
                                            )
                                            ||
                                            (
                                                loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                                && loc.GetAttributeValue<EntityReference>("crm_trust").Id == trust.Id
                                                && loc.GetAttributeValue<EntityReference>("crm_location").Id == location.Id
                                                && !loc.Contains("crm_ward")
                                            )
                                            ||
                                            (
                                                loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                                && loc.GetAttributeValue<EntityReference>("crm_trust").Id == trust.Id
                                                && loc.GetAttributeValue<EntityReference>("crm_location").Id == location.Id
                                                && loc.GetAttributeValue<EntityReference>("crm_ward").Id == ward.Id
                                            )
                                         )
                                         select loc;

        foreach (Entity loc in locationalInformationQuery)
        {
            if (loc.GetAttributeValue<bool>("crm_hasrestriction"))
            {
                return true;
            }
        }

        // No matching locational record carries a restriction.
        return false;
    }

So I think my problem is two-fold:

  1. The locationalInformationQuery query seems to run very slowly... I'm talking something in the region of up to 2 seconds per iteration which is horrible.
  2. I also suspect that the approach of calling matches.RemoveAll() is also somewhat flawed due to the performance issues regarding lists.

So in terms of addressing this, I think that I may be able to get better performance by converting my locationalInformations list to some other type of container such as a Dictionary, HashSet or SortedList. My problem then is that I have no idea how to go about adjusting my query to take advantage of those more efficient containers.

As far as the second point goes, I'd also be curious to hear about alternatives to using List.RemoveAll(). I have the flexibility to modify my incoming container types within reason, so this may be viable.

With regards to the list sizes, in case it's of any use: matches contains a few thousand items, and preFilteredShifts and locationalInformations each contain > 100,000 items.

As an aside, I've tried using Parallel.ForEach instead of foreach, but it made virtually no difference whatsoever.

Edit: Just to clarify some questions, I'm doing all this in memory. I've already completely populated all of my lists, so there shouldn't be any additional round trips to the DB. I'm reasonably certain that GetAttributeValue<EntityReference> doesn't incur further DB overhead.

Also, yes this is a local application calling Dynamics CRM Online.

The code -

foreach (Entity loc in locationalInformationQuery)
    {
        if (loc.GetAttributeValue<bool>("crm_hasrestriction"))
        {
            return true;
        }
    }

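This loop can be one reason for the slowness. You are returning more records than you need and then enumerating them in memory. You can fold the crm_hasrestriction check into the query's where clause and finish with Any(), so that fewer records come back and enumeration stops at the first hit, which could be faster. Something like this -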

return (from loc in locationalInformations
                                     where ((
                                        (
                                            loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                            && !loc.Contains("crm_trust")
                                            && !loc.Contains("crm_location")
                                            && !loc.Contains("crm_ward")
                                        )
                                        ||
                                        (
                                            loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                            && loc.GetAttributeValue<EntityReference>("crm_trust").Id == trust.Id
                                            && !loc.Contains("crm_location")
                                            && !loc.Contains("crm_ward")
                                        )
                                        ||
                                        (
                                            loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                            && loc.GetAttributeValue<EntityReference>("crm_trust").Id == trust.Id
                                            && loc.GetAttributeValue<EntityReference>("crm_location").Id == location.Id
                                            && !loc.Contains("crm_ward")
                                        )
                                        ||
                                        (
                                            loc.GetAttributeValue<EntityReference>("crm_fw").Id == fw.Id
                                            && loc.GetAttributeValue<EntityReference>("crm_trust").Id == trust.Id
                                            && loc.GetAttributeValue<EntityReference>("crm_location").Id == location.Id
                                            && loc.GetAttributeValue<EntityReference>("crm_ward").Id == ward.Id
                                        )
                                     ) && loc.GetAttributeValue<bool>("crm_hasrestriction")) // do the check before fetch in here
                                     select loc).Any(); 
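
Since the question says all three lists are already in memory, a larger saving may come from not rescanning them for every match: the original method runs Single over preFilteredShifts, builds a ToDictionary it never uses, and makes a full pass over locationalInformations on each call. The following is a minimal sketch of indexing once and looking up per match, not the code from this answer. The names RestrictionFilter, Levels and AppliesTo are hypothetical. It assumes the four OR'd branches describe a trust > location > ward hierarchy, and it treats a missing or null attribute as "not scoped at this level", where the original query would throw a NullReferenceException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Xrm.Sdk;

internal static class RestrictionFilter
{
    // Hypothetical helper data: scoping levels, broadest first.
    private static readonly string[] Levels = { "crm_trust", "crm_location", "crm_ward" };

    // Removes every match whose worker has a restricting record covering the match's shift.
    public static void KeepMatchesBasedOnRestrictions(
        List<Entity> matches, List<Entity> preFilteredShifts, List<Entity> locationalInformations)
    {
        if (matches.Count == 0) return;

        // Built once per call rather than once per match: shift Id -> shift.
        Dictionary<Guid, Entity> shiftsById = preFilteredShifts.ToDictionary(s => s.Id);

        // crm_fw Id -> that worker's restricting records only.
        // Records without crm_fw are skipped (an assumption; the original would throw on them).
        ILookup<Guid, Entity> restrictionsByFw = locationalInformations
            .Where(loc => loc.GetAttributeValue<EntityReference>("crm_fw") != null
                       && loc.GetAttributeValue<bool>("crm_hasrestriction"))
            .ToLookup(loc => loc.GetAttributeValue<EntityReference>("crm_fw").Id);

        matches.RemoveAll(match =>
        {
            Guid fwId = match.GetAttributeValue<EntityReference>("crm_fw").Id;
            // Throws KeyNotFoundException if the shift is absent, much as Single would throw.
            Entity shift = shiftsById[match.GetAttributeValue<EntityReference>("crm_shift").Id];
            // An unknown key yields an empty sequence, so Any() is simply false.
            return restrictionsByFw[fwId].Any(loc => AppliesTo(loc, shift));
        });
    }

    // True when every level the record specifies equals the shift's value, and the
    // specified levels form an unbroken prefix of trust > location > ward.
    private static bool AppliesTo(Entity loc, Entity shift)
    {
        bool scopeEnded = false;
        foreach (string level in Levels)
        {
            EntityReference scoped = loc.GetAttributeValue<EntityReference>(level);
            if (scoped == null)
            {
                scopeEnded = true; // not scoped at this level; narrower levels must be empty too
                continue;
            }
            if (scopeEnded) return false; // e.g. ward set without location: no original branch matches
            EntityReference actual = shift.GetAttributeValue<EntityReference>(level);
            if (actual == null || actual.Id != scoped.Id) return false;
        }
        return true;
    }
}
```

With this shape each match costs two hash lookups plus a scan of only that worker's restricting records, instead of two passes over 100,000+ items. List.RemoveAll itself is a single linear pass, so it is unlikely to be the bottleneck here.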

I have on occasion found that querying CRM databases can result in inefficient queries when the query you are working with is sufficiently complex.

Sometimes this can be due to how the query is generated depending on your method of querying the database, or sometimes it can be that iterating some IEnumerable type collections and checking conditions can result in many SQL queries to the database per iteration. Maybe check what is happening under the covers against your database by using SQL Profiler. It may turn out to be insightful.

One option I've resorted to on occasion, where I feel the CRM query limitations simply hamper performance too much, is to fall back to straight ADO.NET and SQL against the filtered views, where I have access to query plans and a much better understanding of what is happening. I'm sure many CRM purists are frowning at me right now, but I think it is a fair call in terms of the end user's experience and also in keeping your code relatively understandable. Complex queries can be quite unwieldy in code, and having a SQL query you can refer to can help immensely in comprehending your solution. You can also benefit from set-based operations and a less "chatty" interface in terms of the number of resultant database calls.

In your question above, if you feel this may be a good option, I'd look at prototyping such a solution by providing a method like:

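private static bool GeographyHasRestrictionBySql(Entity match, List<Entity> preFilteredShifts, List<Entity> locationalInformations)
{
     // Query here, and determine your boolean result to return
     throw new NotImplementedException(); // placeholder so the stub compiles until the query is written
}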

That way you can simply test this quickly and easily by a change in the calling function.
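
One way to make that swap a one-word change is to pass the check into the calling function as a delegate. This is only a sketch: the MatchFilter class and the RestrictionCheck delegate are hypothetical names, and it assumes both implementations keep the signature shown above.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Xrm.Sdk;

internal static class MatchFilter
{
    // Hypothetical delegate: the signature shared by GeographyHasRestriction
    // and GeographyHasRestrictionBySql.
    public delegate bool RestrictionCheck(
        Entity match, List<Entity> preFilteredShifts, List<Entity> locationalInformations);

    // Removes every match for which the supplied check reports a restriction.
    public static void KeepMatchesBasedOnRestrictions(
        List<Entity> matches,
        List<Entity> preFilteredShifts,
        List<Entity> locationalInformations,
        RestrictionCheck hasRestriction)
    {
        if (matches.Count == 0) return;

        matches.RemoveAll(
            match => hasRestriction(match, preFilteredShifts, locationalInformations));
    }
}
```

A caller inside the class that defines the two methods could then pass either GeographyHasRestriction or GeographyHasRestrictionBySql as the last argument and time both against the same input lists.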
