Optimizing a value-based search algorithm with LINQ

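I want to build a value-based search algorithm. That is, given a list of words, I want to search the database for entries that match those words. However, depending on which column/property the words match, I want to change the value assigned to the returned results.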

Here is a lazy algorithm that does the job, but it is very slow.

// search only active entries
var query = (from a in db.Jobs where a.StatusId == 7 select a);
List<SearchResult> baseResult = new List<SearchResult>();
foreach (var item in search)
{
    // if the company title is matched, results are worth 5 points
    var companyMatches = (from a in query where a.Company.Name.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 5 });

    // if the title is matched, results are worth 3 points
    var titleMatches = (from a in query where a.Title.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 3 });

    // if text within the body is matched, results are worth 2 points
    var bodyMatches = (from a in query where a.FullDescription.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 2 });

    // all results are then added
    baseResult = baseResult.Concat(companyMatches.Concat(titleMatches).Concat(bodyMatches)).ToList();
}

// the value gained for each entry is then summed and sorted from highest to lowest
List<SearchResult> result = baseResult.GroupBy(x => x.ID).Select(p => new SearchResult() { ID = p.First().ID, Value = p.Sum(i => i.Value) }).OrderByDescending(a => a.Value).ToList<SearchResult>();

// the query for the complete result set is built based on the sorted id values of result
query = (from id in result join jbs in db.Jobs on id.ID equals jbs.ID select jbs).AsQueryable();

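I am looking for ways to optimize this. I am new to LINQ queries, so I was hoping to get some help. It would be great if I could write a single LINQ query that does all of this in one go, instead of checking the company name, the title, and the body separately, combining the matches into a sorted list, and then running that list against the database again to get the complete listing.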

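It's best to study the problem first. My previous answer was optimizing the wrong thing. The primary problem here is that you are going over the result list multiple times. We can change that: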

foreach (var a in query)
{
    foreach (var item in search)
    {
        // lower-case the search term once per iteration
        var itemLower = item.ToLower();
        if (a.Company.Name.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 5 });
        if (a.Title.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 3 });
        if (a.FullDescription.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 2 });
    }
}

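After that, you have your baseResult and can continue with your processing.

That reduces it to a single query rather than three queries for each search item.

I wasn't sure whether you wanted unique items in baseResult, or whether you allow duplicates for some reason and then use the sum of the values to order them. If you want unique items, you could make baseResult a Dictionary, with the ID as the key.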
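The Dictionary idea could look roughly like the sketch below. It is only an illustration, not code from the question: the Job class is a hypothetical, flattened stand-in for the real entity (CompanyName replaces a.Company.Name), and SearchScoring.ScoreJobs is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-in for the Job entity in the question.
public class Job
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
    public string Title { get; set; }
    public string FullDescription { get; set; }
}

public static class SearchScoring
{
    // Keeps one running total per job ID, so duplicates are never stored.
    public static Dictionary<int, int> ScoreJobs(IEnumerable<Job> jobs, IEnumerable<string> search)
    {
        // Lower-case the search terms once instead of once per job.
        var terms = search.Select(s => s.ToLower()).ToList();
        var scores = new Dictionary<int, int>();

        foreach (var a in jobs)
        {
            var company = a.CompanyName.ToLower();
            var title = a.Title.ToLower();
            var body = a.FullDescription.ToLower();

            int total = 0;
            foreach (var term in terms)
            {
                if (company.Contains(term)) total += 5;
                if (title.Contains(term)) total += 3;
                if (body.Contains(term)) total += 2;
            }

            if (total > 0)
            {
                // TryGetValue leaves current at 0 when the ID is not present yet.
                scores.TryGetValue(a.ID, out var current);
                scores[a.ID] = current + total;
            }
        }
        return scores;
    }
}
```

The sorted IDs can then be read off with something like scores.OrderByDescending(kv => kv.Value).Select(kv => kv.Key), which replaces the GroupBy/Sum step over a list that may contain duplicates.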

Edit after comment

You could reduce the number of items in the list by doing the following:

// accumulate a single value per job and search term
int val = 0;
if (a.Company.Name.ToLower().Contains(itemLower))
    val += 5;
if (a.Title.ToLower().Contains(itemLower))
    val += 3;
if (a.FullDescription.ToLower().Contains(itemLower))
    val += 2;
if (val > 0)
{
    baseResult.Add(new SearchResult { ID = a.ID, Value = val });
}

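That won't eliminate duplicates altogether, though, because the company name could match one search term while the title matches another. But it would reduce the list somewhat.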

Thanks to Jim's answer and some tweaking on my side, I managed to reduce the time it takes to complete the search by 80%.

Here is the final solution:

// establish the initial query
var nquery = (from a in db.Jobs where a.StatusId == 7 select a);

// Instead of scoring every entity, first keep only the possible candidates:
// jobs whose combined columns contain every search term.
// This is the one and only query that will be run against the database.
// Note: the terms are compared as-is here, so they are assumed to be lower-case already.
if (search.Count > 0)
{
    nquery = nquery.Where(job => search.All(y => (job.Title.ToLower() + " " + job.FullDescription.ToLower() + " " + job.Company.Name.ToLower() + " " + job.NormalLocation.ToLower() + " " + job.MainCategory.Name.ToLower() + " " + job.JobType.Type.ToLower()).Contains(y)));
}

// run the query and grab a list of baseJobs
List<Job> baseJobs = nquery.ToList<Job>();

// a list of SearchResult objects (these act as containers for job ids and their search values)
List<SearchResult> baseResult = new List<SearchResult>();

// from here on Jim's algorithm comes into play: it assigns points depending on where
// the search term is found and adds them to a list of id/value pairs
foreach (var a in baseJobs)
{
    foreach (var item in search)
    {
        var itemLower = item.ToLower();

        if (a.Company.Name.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 5 });
        if (a.Title.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 3 });
        if (a.FullDescription.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 2 });
    }
}

// sum the points per job id and sort from highest to lowest
List<SearchResult> result = baseResult.GroupBy(x => x.ID).Select(p => new SearchResult() { ID = p.First().ID, Value = p.Sum(i => i.Value) }).OrderByDescending(a => a.Value).ToList<SearchResult>();

// the id/value pair list is then used to reorder the initial jobs
var NewQuery = (from id in result join jbs in baseJobs on id.ID equals jbs.ID select jbs).AsQueryable();
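Once the candidates are in memory, the scoring, grouping, and reordering steps could probably be folded into one LINQ-to-Objects expression. The sketch below is an assumption-laden illustration rather than the code that produced the 80% figure: Job is a hypothetical, flattened stand-in for the entity (CompanyName replaces a.Company.Name), and SearchRanking.RankJobs is a made-up helper name. Jobs that score zero are dropped, which mirrors how the join above leaves out jobs that never made it into baseResult.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-in for the Job entity in the question.
public class Job
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
    public string Title { get; set; }
    public string FullDescription { get; set; }
}

public static class SearchRanking
{
    // Scores each in-memory job once and returns the jobs ordered by score, highest first.
    public static List<Job> RankJobs(IEnumerable<Job> baseJobs, IEnumerable<string> search)
    {
        // Lower-case the search terms once up front.
        var terms = search.Select(s => s.ToLower()).ToList();

        return baseJobs
            .Select(job => new
            {
                Job = job,
                // 5 points per company match, 3 per title match, 2 per body match.
                Score = terms.Sum(term =>
                    (job.CompanyName.ToLower().Contains(term) ? 5 : 0) +
                    (job.Title.ToLower().Contains(term) ? 3 : 0) +
                    (job.FullDescription.ToLower().Contains(term) ? 2 : 0))
            })
            .Where(x => x.Score > 0)           // drop jobs that earned no points
            .OrderByDescending(x => x.Score)   // stable sort: ties keep their original order
            .Select(x => x.Job)
            .ToList();
    }
}
```

This avoids the intermediate SearchResult list and the final join, at the cost of not exposing the scores themselves; if the scores are needed, the anonymous type could be returned in place of the bare Job.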
