How do I optimize Linq queries and updating values for speed

We have a simple calculation engine that takes a string (such as "(x + y) * 4") stored in the database, pulls the values for x and y from the database, performs the calculation, and saves the result back to the database. This seems to be taking far too long, and I am afraid I have stepped into a Linq pitfall. Please let me know if there are ways to improve this:

        public Nullable<decimal> CalculateFormulaByYearDistrict(int formulaId, int fundingYearId, int districtId)
        {
            string formulaText = "";
            decimal? retValue = null;    

            using (STARSEntities context = new STARSEntities())
            {

                var formulaItems = from fi in context.STARS_FormulaItem
                                   where fi.FormulaId == formulaId
                                   select fi;

                STARS_Formula formula = formulaItems.FirstOrDefault().STARS_Formula;
                formulaText = formula.FormulaText;

                foreach (STARS_FormulaItem formulaItem in formulaItems)
                {
                    int accountingItemId = formulaItem.AccountingItemId;

                    var itemValue = (from iv in context.AccountingItemValues
                                     join di in context.STARS_DistrictInputData
                                     on iv.DomainSpecificId equals di.DistrictInputDataId
                                     where (di.DistrictId == districtId || di.DistrictId == -1) //District -1 is the invalid and universal district for coefficients
                                     && di.DomainYearReportingPeriod.FundingYearId == fundingYearId
                                     && iv.AccountingItemId == accountingItemId
                                     select iv).SingleOrDefault();
                    //If no value exists for the requested Assessment Item Value, then force an error message into the formula text
                    //to be thrown during calculate.
                    if (itemValue != null)
                        formulaText = Regex.Replace(formulaText, @"\b" + formulaItem.ItemCode + @"\b", itemValue.Amount.ToString());
                    else
                        formulaText = Regex.Replace(formulaText, @"\b" + formulaItem.ItemCode + @"\b", "No Value Exists for " + formulaItem.ItemCode);
                }
                switch (formula.FormulaTypeId)
                {
                    case (int)FormulaType.CALC:
                        retValue = Calculate(formulaText);
                        break;
                    case (int)FormulaType.EXPONENT:
                        // pull the number directly after e, calculate Math.Exp(value), and push that into the calculation.
                        retValue = Calculate(ReplaceExponent(formulaText));
                        break;
                    case (int)FormulaType.IFTHEN:
                        // evaluate the If statements and pass any math to the calculator.
                        retValue = Calculate(EvaluateIf(formulaText));
                        break;
                    default:
                        break;
                }
            }            
            return retValue;
        }

        public bool CalculateAndSaveResults(DistrictDataCategory category, List<int> districtIds, int fundingYearId, int userId)
        {
            //Optimization Logic
            DateTime startTime = DateTime.Now;
            Debug.WriteLine("Starting Calculate and Save at:" + startTime.ToString());

            using (STARSEntities context = new STARSEntities())
            {

                var formulas = from f in context.STARS_FormulaCategory
                               where f.DistrictDataCategoryId == (int)category
                               select f.STARS_Formula;

                foreach (var districtId in districtIds)
                {
                    Debug.WriteLine("District: " + districtId.ToString());
                    DateTime districtStartTime = DateTime.Now;

                    foreach (var formula in formulas)
                    {
                        var itemValue = (from iv in context.AccountingItemValues
                                         join di in context.STARS_DistrictInputData
                                         on iv.DomainSpecificId equals di.DistrictInputDataId
                                         where (di.DistrictId == districtId)
                                         && di.DomainYearReportingPeriod.FundingYearId == fundingYearId
                                         && iv.AccountingItemId == formula.ResultAccountingItemId
                                         select new { iv, di }).SingleOrDefault();

                        itemValue.iv.Amount = CalculateFormulaByYearDistrict(formula.FormulaId, fundingYearId, districtId);

                        //Update Actual Amount Record
                        itemValue.iv.LastUpdated = DateTime.Now;
                        itemValue.iv.UpdatedBy = userId;

                        //Update District Data Import Record
                        itemValue.di.LastUpdated = DateTime.Now;
                        itemValue.di.UpdatedBy = userId;
                    }
                    Debug.WriteLine("District Calculation took: " + ((TimeSpan)(DateTime.Now - districtStartTime)).ToString() + " for " + districtId.ToString());
                }

                context.SaveChanges();
            }
            Debug.WriteLine("Finished Calculate and Save at:" + ((TimeSpan)(DateTime.Now - startTime)).ToString());
            return true;
        }

Let me know if you need any information about the underlying data structure. One thing that seems important is that there is an associative entity (STARS_FormulaCategory in the code above) on the formula table, which stores the formula text, so that we can perform all the calculations of a particular type for a given district. The actual values are stored in an AccountingItemValue table, but there is an associated table called DistrictInputData that holds the location information for the accounting item values.

Thank you very much.

I would start by breaking up the methods and profiling at a more granular level; work out exactly what is causing the performance hit.
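One way to profile at that finer granularity is to time each step separately with `System.Diagnostics.Stopwatch` rather than subtracting `DateTime.Now` values. The sketch below is only an illustration: `StepTimer`, `Measure`, `TotalFor` and `Report` are hypothetical names that do not appear in the question, and the helper is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper (not part of the original code): accumulates the
// elapsed time per named step so the slow part of a loop can be identified.
public static class StepTimer
{
    private static readonly Dictionary<string, TimeSpan> Totals =
        new Dictionary<string, TimeSpan>();

    // Runs the step, adds its elapsed time to that step's running total,
    // and returns whatever the step returned.
    public static T Measure<T>(string step, Func<T> work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            return work();
        }
        finally
        {
            watch.Stop();
            TimeSpan soFar;
            Totals.TryGetValue(step, out soFar); // stays TimeSpan.Zero if the step is new
            Totals[step] = soFar + watch.Elapsed;
        }
    }

    // Total time recorded for a step so far (zero if it never ran).
    public static TimeSpan TotalFor(string step)
    {
        TimeSpan total;
        Totals.TryGetValue(step, out total);
        return total;
    }

    // Writes one line per step to the debug output.
    public static void Report()
    {
        foreach (KeyValuePair<string, TimeSpan> pair in Totals)
            Debug.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

Inside the loops, a call such as `StepTimer.Measure("item value query", () => query.SingleOrDefault())` around each query, and another around `Calculate(formulaText)`, would show whether the time goes to the database round trips or to the calculation itself.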

It could be that the issue isn't Linq but the database. Have you profiled and optimised your db at all? Do you have sensible indexes, and are they being used by the SQL that EF is generating?

I don't see anything obviously wrong with your Linq queries.

Never underestimate the cost of queries inside loops. Your best bet may be to approach the problem differently and pull some of those looped queries out. Have you run any timers to see exactly where it is taking the longest? I'd be willing to bet it's those LINQ queries within the foreach loops.
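A minimal sketch of pulling the looped query out: fetch every relevant value once, index the rows in memory, and have the loops do dictionary lookups instead of database round trips. The shapes here are assumptions made for illustration; `ItemValueRow` is a flattened stand-in for the joined AccountingItemValues / STARS_DistrictInputData rows, and `ValueCache`, `Build` and `Lookup` are hypothetical names. In the real code, `rows` would come from a single query against the context, materialized once before the loops.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed, flattened stand-in for one joined row from the question's tables.
public class ItemValueRow
{
    public int AccountingItemId { get; set; }
    public int DistrictId { get; set; }
    public int FundingYearId { get; set; }
    public decimal Amount { get; set; }
}

public static class ValueCache
{
    // One pass over the source instead of one query per formula item:
    // keep the rows for the requested year and districts (plus the universal
    // district -1), then index them by (district, accounting item).
    // ToDictionary throws if two rows share a key.
    public static Dictionary<Tuple<int, int>, decimal> Build(
        IEnumerable<ItemValueRow> rows, int fundingYearId, ICollection<int> districtIds)
    {
        return rows
            .Where(r => r.FundingYearId == fundingYearId
                        && (r.DistrictId == -1 || districtIds.Contains(r.DistrictId)))
            .ToDictionary(r => Tuple.Create(r.DistrictId, r.AccountingItemId),
                          r => r.Amount);
    }

    // Looks up the district's own value first, then the universal (-1) value.
    // Returns null when neither exists.
    public static decimal? Lookup(
        Dictionary<Tuple<int, int>, decimal> cache, int districtId, int accountingItemId)
    {
        decimal amount;
        if (cache.TryGetValue(Tuple.Create(districtId, accountingItemId), out amount))
            return amount;
        if (cache.TryGetValue(Tuple.Create(-1, accountingItemId), out amount))
            return amount;
        return null;
    }
}
```

One behavioural difference from the question's code is worth noting: the original `SingleOrDefault()` throws when both a district-specific and a universal row match, whereas this sketch prefers the district-specific row.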

Evaluated naively, the Linq JOIN operation loops over both tables in full and "merges" the rows that match the ON condition.
It then loops over that result and filters it by the WHERE conditions. (With Entity Framework the query is translated to SQL, so the database's query planner may do better than this; it is worth checking the generated SQL.)

So if:

context.AccountingItemValues = N 
context.STARS_DistrictInputData = M

Then the join operation gives you a result table (let's think in SQL terms for a moment) of size Max(M,N) in the worst case.

Then it will run over the whole result table and filter the results with the WHERE conditions.

So you're looping through the whole database a bit more than twice. And the JOIN operation is not linear, so you get even more iterations over the whole thing.

Improvement:

Apply the table-specific where conditions before the join, so that you reduce the size of the tables before joining them.
That will give you:

context.AccountingItemValues = numberOf(accountingItemId)
context.STARS_DistrictInputData = numberOf(fundingYearId)

Then the join operation is done on much smaller tables.
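The filter-then-join idea can be sketched as follows, using in-memory collections so the example is self-contained. The classes `ItemValue` and `DistrictInput` are simplified assumptions standing in for the question's entities (in particular, `FundingYearId` is flattened onto `DistrictInput` instead of being reached through `DomainYearReportingPeriod`), and `FilteredJoin.Find` is a hypothetical name. Against a database, the query planner may already push these filters down, so the gain should be measured rather than assumed.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified, assumed stand-ins for the entities in the question.
public class ItemValue
{
    public int AccountingItemId;
    public int DomainSpecificId;
    public decimal Amount;
}

public class DistrictInput
{
    public int DistrictInputDataId;
    public int DistrictId;
    public int FundingYearId;
}

public static class FilteredJoin
{
    public static List<ItemValue> Find(
        IEnumerable<ItemValue> itemValues, IEnumerable<DistrictInput> inputs,
        int accountingItemId, int districtId, int fundingYearId)
    {
        // Apply each table's own conditions first...
        IEnumerable<ItemValue> ivs =
            itemValues.Where(iv => iv.AccountingItemId == accountingItemId);
        IEnumerable<DistrictInput> dis =
            inputs.Where(di => di.FundingYearId == fundingYearId
                               && (di.DistrictId == districtId || di.DistrictId == -1));

        // ...then join the two reduced sets on the same key as the original query.
        return (from iv in ivs
                join di in dis on iv.DomainSpecificId equals di.DistrictInputDataId
                select iv).ToList();
    }
}
```

The join condition and the filters are the same as in the question's query; only the order in which they are written changes.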
