How do I optimize LINQ queries and value updates for speed?

We have a simple calculation engine that takes a formula string (such as "(x + y) * 4") stored in the database, pulls the values for x and y from the database, performs the calculation, and saves the result back to the database. This is taking far too long, and I am afraid I have stepped into a LINQ pitfall. Please let me know if there are ways to improve this:

 public Nullable<decimal> CalculateFormulaByYearDistrict(int formulaId, int fundingYearId, int districtId)
        {
            string formulaText = "";
            decimal? retValue = null;    

            using (STARSEntities context = new STARSEntities())
            {

                var formulaItems = from fi in context.STARS_FormulaItem
                                   where fi.FormulaId == formulaId
                                   select fi;

                STARS_Formula formula = formulaItems.FirstOrDefault().STARS_Formula;
                formulaText = formula.FormulaText;

                foreach (STARS_FormulaItem formulaItem in formulaItems)
                {
                    int accountingItemId = formulaItem.AccountingItemId;

                    var itemValue = (from iv in context.AccountingItemValues
                                     join di in context.STARS_DistrictInputData
                                     on iv.DomainSpecificId equals di.DistrictInputDataId
                                     where (di.DistrictId == districtId || di.DistrictId == -1) //District -1 is the invalid and universal district for coefficients
                                     && di.DomainYearReportingPeriod.FundingYearId == fundingYearId
                                     && iv.AccountingItemId == accountingItemId
                                     select iv).SingleOrDefault();
                    //If no value exists for the requested Assessment Item Value, then force an error message into the formula text
                    //to be thrown during calculate.
                    if (itemValue != null)
                        formulaText = Regex.Replace(formulaText, @"\b" + formulaItem.ItemCode + @"\b", itemValue.Amount.ToString());
                    else
                        formulaText = Regex.Replace(formulaText, @"\b" + formulaItem.ItemCode + @"\b", "No Value Exists for " + formulaItem.ItemCode);
                }
                switch (formula.FormulaTypeId)
                {
                    case (int)FormulaType.CALC:
                        retValue = Calculate(formulaText);
                        break;
                    case (int)FormulaType.EXPONENT:
                        // pull the number directly after e, calculate Math.Exp(value), and then push that into the calculation.
                        retValue = Calculate(ReplaceExponent(formulaText));
                        break;
                    case (int)FormulaType.IFTHEN:
                        // evaluate the If statements and pass any math to the calculator.
                        retValue = Calculate(EvaluateIf(formulaText));
                        break;
                    default:
                        break;
                }
            }            
            return retValue;
        }

public bool CalculateAndSaveResults(DistrictDataCategory category, List<int> districtIds, int fundingYearId, int userId)
        {
            //Optimization Logic
            DateTime startTime = DateTime.Now;
            Debug.WriteLine("Starting Calculate and Save at:" + startTime.ToString());

            using (STARSEntities context = new STARSEntities())
            {

                var formulas = from f in context.STARS_FormulaCategory
                               where f.DistrictDataCategoryId == (int)category
                               select f.STARS_Formula;

                foreach (var districtId in districtIds)
                {
                    Debug.WriteLine("District: " + districtId.ToString());
                    DateTime districtStartTime = DateTime.Now;

                    foreach (var formula in formulas)
                    {
                        var itemValue = (from iv in context.AccountingItemValues
                                         join di in context.STARS_DistrictInputData
                                         on iv.DomainSpecificId equals di.DistrictInputDataId
                                         where (di.DistrictId == districtId)
                                         && di.DomainYearReportingPeriod.FundingYearId == fundingYearId
                                         && iv.AccountingItemId == formula.ResultAccountingItemId
                                         select new { iv, di }).SingleOrDefault();

                        itemValue.iv.Amount = CalculateFormulaByYearDistrict(formula.FormulaId, fundingYearId, districtId);

                        //Update Actual Amount Record
                        itemValue.iv.LastUpdated = DateTime.Now;
                        itemValue.iv.UpdatedBy = userId;

                        //Update District Data Import Record
                        itemValue.di.LastUpdated = DateTime.Now;
                        itemValue.di.UpdatedBy = userId;
                    }
                    Debug.WriteLine("District Calculation took: " + ((TimeSpan)(DateTime.Now - districtStartTime)).ToString() + " for " + districtId.ToString());
                }

                context.SaveChanges();
            }
            Debug.WriteLine("Finished Calculate and Save at:" + ((TimeSpan)(DateTime.Now - startTime)).ToString());
            return true;
        }

Let me know if you need any information about the underlying data structure. The points that seem important are these: an associative entity (STARS_FormulaCategory in the code above) sits between the formula table, which stores the formula text, and the district data category, so that we can perform all the calculations of a particular type for a given district. The actual values are stored in an AccountingItemValue table, and an associated table called DistrictInputData holds the location information for those accounting item values.

Thank you very much.

I would start by breaking up the methods and profiling at a more granular level; work out exactly what is causing the performance hit.
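
A minimal sketch of what finer-grained timing could look like, assuming System.Diagnostics.Stopwatch is acceptable in place of subtracting DateTime.Now values. StepTimer and Measure are hypothetical names, not part of the code in the question.

```csharp
using System;
using System.Diagnostics;

public static class StepTimer
{
    // Runs the action, writes the elapsed time to the debug output,
    // and returns it so the caller can also aggregate timings.
    public static TimeSpan Measure(string label, Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        Debug.WriteLine(label + " took " + stopwatch.Elapsed);
        return stopwatch.Elapsed;
    }
}
```

Wrapping the item value query, the Regex.Replace calls, Calculate and SaveChanges in separate Measure calls should show which of them dominates the total.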

The issue may not be LINQ at all but the database: have you profiled and optimised your db? Do you have sensible indexes, and are they being used by the SQL that EF is generating?

I don't see anything obviously wrong with your linq queries.

Never underestimate the power of queries inside of loops. Maybe your best bet would be to look at it from a different approach and pull some of those looped queries out. Have you run any timers to see where exactly it is taking the longest? I'd be willing to bet it's those LINQ queries within the foreach loops.
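
One way to pull a looped query out, sketched here against in-memory data: load the values for a district and funding year once, index them by accounting item, and look them up inside the loop. ItemValueRow and ValueLookup are hypothetical, simplified stand-ins for the real entities, and the sketch assumes at most one value per accounting item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for an AccountingItemValue row.
public class ItemValueRow
{
    public int AccountingItemId { get; set; }
    public decimal Amount { get; set; }
}

public static class ValueLookup
{
    // Builds the lookup once. In the real code 'rows' would come from a single
    // query per district and funding year. ToDictionary throws on duplicate keys,
    // so this assumes one value per accounting item.
    public static Dictionary<int, decimal> Build(IEnumerable<ItemValueRow> rows)
    {
        return rows.ToDictionary(r => r.AccountingItemId, r => r.Amount);
    }

    // In-memory replacement for the per-item SingleOrDefault() query:
    // returns null when no value exists for the requested item.
    public static decimal? Find(Dictionary<int, decimal> lookup, int accountingItemId)
    {
        decimal amount;
        return lookup.TryGetValue(accountingItemId, out amount) ? amount : (decimal?)null;
    }
}
```

With this shape the database is hit once per district rather than once per formula item, and the loop body only does dictionary lookups.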

If the join runs in memory (LINQ to Objects), the JOIN operation reads both collections in full and matches their rows on the ON keys.
Only then does it loop over the joined result and filter it by the WHERE conditions. With LINQ to Entities the whole query is translated into a single SQL statement, so the database's query planner decides the actual order; it is worth checking the generated SQL.

So if:

context.AccountingItemValues has N rows
context.STARS_DistrictInputData has M rows

then the join operation gives you (let's think like SQL for a moment) an intermediate table of up to N rows when each AccountingItemValue matches at most one DistrictInputData row, and up to N × M rows in the general worst case.

The WHERE conditions are then evaluated against every row of that intermediate table.

So each execution reads both tables in full and then scans the joined result again, and the query is executed once per loop iteration.

Improvement:

Apply each table's own WHERE conditions before the join, so that both inputs are smaller by the time the join runs.
That leaves:

context.AccountingItemValues reduced to the rows with the requested accountingItemId
context.STARS_DistrictInputData reduced to the rows with the requested fundingYearId

The join operation is then done on much smaller tables.
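
A sketch of the filter-then-join shape on in-memory collections. ValueRow, InputRow and FilteredJoin are hypothetical, flattened stand-ins for the entities in the question (FundingYearId is a plain property here instead of a navigation through DomainYearReportingPeriod). With Entity Framework the query is translated to SQL as a whole, so whether this ordering changes anything there depends on the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-ins for the two entity types in the question.
public class ValueRow
{
    public int AccountingItemId { get; set; }
    public int DomainSpecificId { get; set; }
    public decimal Amount { get; set; }
}

public class InputRow
{
    public int DistrictInputDataId { get; set; }
    public int DistrictId { get; set; }
    public int FundingYearId { get; set; }
}

public static class FilteredJoin
{
    public static List<ValueRow> Find(
        IEnumerable<ValueRow> values, IEnumerable<InputRow> inputs,
        int accountingItemId, int districtId, int fundingYearId)
    {
        // Apply each table's own conditions first...
        var wantedValues = values.Where(v => v.AccountingItemId == accountingItemId);
        var wantedInputs = inputs.Where(d =>
            (d.DistrictId == districtId || d.DistrictId == -1) &&
            d.FundingYearId == fundingYearId);

        // ...then join the two reduced sequences on the key.
        return (from v in wantedValues
                join d in wantedInputs on v.DomainSpecificId equals d.DistrictInputDataId
                select v).ToList();
    }
}
```

The district condition is applied alongside the funding year here because it also belongs to the DistrictInputData side only.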
