
Need help optimizing loop with Linq

Disclaimer: I have little experience with LINQ.

One of my tasks at work is to maintain an e-commerce website. Yesterday, one of our customers started complaining of a timeout that occurred when they tried to create a feed file for Google. It turns out that if the user has more than 9,000 items to put in their feed file, our code takes at least one minute to execute.

I couldn't find the source of the problem by running the debugger, so I fired up a profiler (ANTS) and let it do its thing. It found the source of our problem: a foreach loop that contains a bit of LINQ code. Here is the code:

var productMappings = GoogleProductMappingAccess.GetGoogleProductMappingsByID(context, productID);
List<google_Category> retCats = new List<google_Category>(numCategories);
int added = 0;

//this line was flagged by the profiler as taking 48.5% of total run time
foreach (google_ProductMapping pm in (from pm in productMappings orderby pm.MappingType descending select pm))
{
    if (pm.GoogleCategoryId.HasValue && pm.GoogleCategoryId > 0)
    {
        //this line was flagged as 36% of the total time
        retCats.Add(pm.google_Category);
    }

    else if (pm.GoogleCategoryMappingId.HasValue && pm.GoogleCategoryMappingId > 0)
    {
        retCats.Add(pm.google_CategoryMapping.google_Category);
    }
    else
    {
        continue;
    }

    if (++added >= numCategories)
    {
        break;
    }
}

Do any of you more experienced devs have any ideas? I was toying with replacing all the LINQ with SQL, but I am unsure whether that is the best course of action here (if it was written with LINQ, there must be a reason for it).

If you can filter out the results you don't want before sorting, your query should be faster. Because you are using orderby, every result has to be evaluated by the sort, including the ones you later skip:

 productMappings.Where( pm => (pm.GoogleCategoryId.HasValue
                                && pm.GoogleCategoryId > 0)
                              ||(pm.GoogleCategoryMappingId.HasValue && 
                                 pm.GoogleCategoryMappingId > 0)
                      )
                .OrderBy(...)

You should also limit the number of results returned by the query, since you only use up to numCategories. So add a

.Take(numCategories)

to your query, instead of checking within the foreach loop.
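As a rough illustration of the filter, then sort, then Take order suggested above, here is a minimal in-memory sketch. The `Mapping` class and `MappingQuery.TopMappings` are hypothetical stand-ins that are not part of the original code, and `OrderByDescending` on `MappingType` is assumed from the question's `orderby ... descending` clause.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for google_ProductMapping; only the fields used here.
public sealed class Mapping
{
    public int? GoogleCategoryId { get; set; }
    public int? GoogleCategoryMappingId { get; set; }
    public int MappingType { get; set; }
}

public static class MappingQuery
{
    // Filter first so the sort only sees usable rows, then cap the result size.
    public static List<Mapping> TopMappings(IEnumerable<Mapping> mappings, int numCategories)
    {
        return mappings
            .Where(pm => (pm.GoogleCategoryId.HasValue && pm.GoogleCategoryId > 0)
                      || (pm.GoogleCategoryMappingId.HasValue && pm.GoogleCategoryMappingId > 0))
            .OrderByDescending(pm => pm.MappingType)
            .Take(numCategories)
            .ToList();
    }
}
```

With this shape the early `break` and the `added` counter in the loop are no longer needed, because `Take` stops the enumeration once enough rows have been produced.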

The reason retCats.Add(pm.google_Category); takes so long is that you are referencing a lazily loaded object, which causes another round trip to the server. If you can refactor so that you only take a local copy of the Id instead of the whole object, that part will speed up.

If you do need the whole object, investigate how you can pull it down in a single query when getting the productMappings. How to do this depends on which LINQ wrapper you are using over your SQL.
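To make the cost difference concrete, here is a small simulation rather than real data-access code. `FakeDb` and `LoadingDemo` are hypothetical and simply count simulated round trips. In a real project the single-query version would come from the data layer itself, for example `DataLoadOptions.LoadWith` in LINQ to SQL or `Include` in Entity Framework, depending on which one is in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory "database" that counts simulated round trips.
public static class FakeDb
{
    public static int RoundTrips;

    private static readonly Dictionary<int, string> Categories =
        new Dictionary<int, string> { { 1, "Books" }, { 2, "Toys" }, { 3, "Games" } };

    // One simulated round trip per call, like touching a lazily loaded property.
    public static string LoadCategory(int id)
    {
        RoundTrips++;
        return Categories[id];
    }

    // One simulated round trip for the whole batch, like an eager load or a join.
    public static Dictionary<int, string> LoadCategories(IEnumerable<int> ids)
    {
        RoundTrips++;
        return ids.Distinct().ToDictionary(id => id, id => Categories[id]);
    }
}

public static class LoadingDemo
{
    // N ids cost N round trips.
    public static List<string> Lazy(IReadOnlyList<int> categoryIds)
    {
        return categoryIds.Select(id => FakeDb.LoadCategory(id)).ToList();
    }

    // N ids cost one round trip, followed by local dictionary lookups.
    public static List<string> Eager(IReadOnlyList<int> categoryIds)
    {
        var byId = FakeDb.LoadCategories(categoryIds);
        return categoryIds.Select(id => byId[id]).ToList();
    }
}
```

Both methods return the same names. Only the number of simulated round trips differs, and that per-item cost is what the profiler attributed to the `retCats.Add` line.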

Without knowing your database schema, it's really hard to tell. A couple of ideas:

1) Run the query through the Database Engine Tuning Advisor. Maybe the query needs some indexes?

2) Pre-process this information and put it in another table or file. That way, when Google requests it, it won't time out.

This should probably work:

var productMappings = GoogleProductMappingAccess.GetGoogleProductMappingsByID(context, productID);
var categories = from pm in productMappings
                 where pm.GoogleCategoryId > 0 ||
                       pm.GoogleCategoryMappingId > 0
                 orderby pm.MappingType descending
                 select pm.google_Category ??
                        pm.google_CategoryMapping.google_Category;

return categories.Take(numCategories);

It would work best if GetGoogleProductMappingsByID returned an IQueryable (if applicable). If so, LINQ will convert the entire statement into a T-SQL command, which would be far faster than in-memory LINQ.

Feel free to add a .ToList() to the last statement to get the same return type as in your code (and to force execution of the LINQ statement).
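The point about `.ToList()` forcing execution can be seen with a small in-memory sketch. This uses LINQ to Objects only, so it shows deferred execution, not translation to T-SQL. `DeferredDemo` and its counter are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the filter predicate actually runs.
    public static int Evaluations;

    // Builds a query but does not run it; the counter moves only on enumeration.
    public static IEnumerable<int> BuildQuery(IEnumerable<int> source, int take)
    {
        return source
            .Where(x => { Evaluations++; return x > 0; })
            .Take(take);
    }
}
```

Calling `BuildQuery` leaves `Evaluations` at zero. The predicate runs only when the result is enumerated, for example by `.ToList()`, and it runs again on every later enumeration of the same query. Materializing once with `.ToList()` avoids that repeated work.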

Checking for both .HasValue and > 0 is redundant. Checking for Id > 0 is enough, because a comparison against a null value evaluates to false.
For more info: http://msdn.microsoft.com/en-us/library/2cf62fcy.aspx (operators)
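A quick way to convince yourself of this in plain C# is the sketch below. `NullableCheck` is a hypothetical helper, not part of the original code.

```csharp
using System;

public static class NullableCheck
{
    // Lifted comparison: when id is null, id > 0 evaluates to false,
    // so no separate HasValue guard is needed.
    public static bool IsPositive(int? id)
    {
        return id > 0;
    }

    // The longer form from the question, kept for comparison.
    public static bool IsPositiveVerbose(int? id)
    {
        return id.HasValue && id > 0;
    }
}
```

Both methods give the same answer for every input, including null.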
