Customize Lucene.net search result

This is my first time using Lucene.net, and it is working fine. When I search with the keywords "jcb geo", I get results, but the first results are related to the JCB keyword and only the next few are related to GEO. I do not understand why JCB comes out on top. Most of the results are related to GEO, so I think the GEO-related data should come first, followed by JCB.

Here is the code I use to search:

        string multiWordPhrase = "";
        multiWordPhrase = txtSearch.Text.Trim().Replace("*", "").Replace("?", "").Replace("~", "");
        IndexSearcher searcher = null;
        List<SearchResult> list = new List<SearchResult>();
        SearchResult oSr = null;

        if (!string.IsNullOrEmpty(multiWordPhrase))
        {
            string[] fieldList = { "Title", "Description", "Url" };
            List<BooleanClause.Occur> occurs = new List<BooleanClause.Occur>();
            foreach (string field in fieldList)
            {
                occurs.Add(BooleanClause.Occur.SHOULD);
            }

            searcher = new IndexSearcher(_directory, false);
            Query qry = MultiFieldQueryParser.Parse(Version.LUCENE_29, multiWordPhrase, fieldList, occurs.ToArray(), new StandardAnalyzer(Version.LUCENE_29));
            TopDocs topDocs = searcher.Search(qry, null, ((PageIndex + 1) * PageSize), Sort.RELEVANCE);
            ScoreDoc[] scoreDocs = topDocs.ScoreDocs;
            int resultsCount = topDocs.TotalHits;

            if (topDocs != null)
            {
                for (int i = (PageIndex * PageSize); i < ((PageIndex + 1) * PageSize) && i < topDocs.ScoreDocs.Length; i++)
                {
                    Document doc = searcher.Doc(topDocs.ScoreDocs[i].doc);
                    oSr = new SearchResult();
                    oSr.ID = doc.Get("ID");
                    oSr.Title = doc.Get("Title");
                    oSr.Description = doc.Get("Description");
                    //oSr.WordCount = AllExtension.WordCount(oSr.Description, WordExist(oSr.Title, multiWordPhrase));
                    oSr.Description = AllExtension.HighlightKeywords(oSr.Description, multiWordPhrase);
                    oSr.Url = doc.Get("Url");
                    list.Add(oSr);
                }
            }
            lblMatchFound.Text = "Match Found " + resultsCount.ToString();

            Pagination pagination = new Pagination();
            pagination.BaseUrl = "/Search.aspx";
            pagination.TotalRows = resultsCount;
            pagination.CurPage = (PageIndex+1);
            pagination.PerPage = PageSize;
            pagination.PrevLink = "Prev";
            pagination.NextLink = "Next";
            pagination.SearchTerm = multiWordPhrase;
            lblPager.Text = pagination.GetPageLinks();

            rptResult.DataSource = list;
            rptResult.DataBind();
            searcher.Close();
        }

If possible, please explain why the JCB-related data comes out on top, and tell me how I can customize the search results so that the records matching the most search terms (like GEO) come first. Sample code would help a lot, since I am new to Lucene.net and it would make things easier to visualize. Thanks a lot.

To make sense of the scores, you will need to understand the scoring formula that LB linked to, and you will need to implement your own Similarity if you want to change it.
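As a rough illustration of what "your own Similarity" could look like, here is a minimal sketch that assumes the Lucene.Net 2.9.x API. The class name FlatSimilarity is made up for this example. It switches off the term-rarity (idf) and field-length factors so that the score depends mainly on how many query terms match and how often. Treat it as a starting point to experiment with, not as a recommended configuration.

```csharp
using Lucene.Net.Search;

// Hypothetical class name; assumes the Lucene.Net 2.9.x DefaultSimilarity API.
public class FlatSimilarity : DefaultSimilarity
{
    // Ignore how rare a term is in the index: every term gets the same weight.
    public override float Idf(int docFreq, int numDocs)
    {
        return 1.0f;
    }

    // Ignore field length: a match in a short field no longer outranks
    // the same match in a long field.
    public override float LengthNorm(string fieldName, int numTerms)
    {
        return 1.0f;
    }
}
```

In 2.9.x it would be applied with searcher.SetSimilarity(new FlatSimilarity()) before calling Search. Length norms are written at index time, so the LengthNorm change only takes effect if the same similarity is also set on the IndexWriter and the index is rebuilt.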

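In your case, what probably happens is that the JCB term is much rarer in the index than the GEO term, and the default scoring gives rare terms more weight (inverse document frequency). It could also be that the fields containing the JCB term are shorter, since the default length normalization favors matches in shorter fields.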
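To put rough numbers on those two effects (term rarity and document length), here is a small standalone sketch of the two formulas used by Lucene's DefaultSimilarity. It does not reference Lucene.Net; the class name ScoringSketch and all document counts are made up for illustration, and the real score combines these factors with others (term frequency, coord, boosts).

```csharp
using System;

// Hypothetical helper that mirrors two DefaultSimilarity formulas.
public static class ScoringSketch
{
    // idf = 1 + ln(numDocs / (docFreq + 1)): the fewer documents contain
    // the term, the larger the value.
    public static double Idf(int docFreq, int numDocs)
    {
        return 1.0 + Math.Log((double)numDocs / (docFreq + 1));
    }

    // lengthNorm = 1 / sqrt(number of terms in the field): the shorter
    // the field, the larger the value.
    public static double LengthNorm(int numTerms)
    {
        return 1.0 / Math.Sqrt(numTerms);
    }
}
```

With an assumed index of 1000 documents where 5 contain JCB and 400 contain GEO, Idf(5, 1000) is about 6.1 while Idf(400, 1000) is about 1.9, so a JCB match starts with a much larger weight. Likewise LengthNorm(4) is 0.5 while LengthNorm(100) is 0.1.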

Additionally, you can use the Explain method of the IndexSearcher to see how a document was scored: http://lucene.apache.org/core/old_versioned_docs/versions/2_9_4/api/all/org/apache/lucene/search/IndexSearcher.html#explain(org.apache.lucene.search.Weight, int)
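Since sample code was requested, here is a minimal sketch of calling Explain from C#, assuming Lucene.Net 2.9.x. The class name ExplainSketch, the single "Title" field, and the in-memory index are assumptions made to keep the example self-contained; in the question's code the equivalent call would be searcher.Explain(qry, topDocs.ScoreDocs[i].doc) inside the result loop.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

// Hypothetical helper; assumes the Lucene.Net 2.9.x API.
public static class ExplainSketch
{
    // Indexes the given titles in memory, runs the query, and returns
    // one score explanation per hit, in ranking order.
    public static List<string> ExplainHits(string queryText, params string[] titles)
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);
        var directory = new RAMDirectory();

        var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
        foreach (string title in titles)
        {
            var doc = new Document();
            doc.Add(new Field("Title", title, Field.Store.YES, Field.Index.ANALYZED));
            writer.AddDocument(doc);
        }
        writer.Close();

        var searcher = new IndexSearcher(directory, true);
        Query query = new QueryParser(Version.LUCENE_29, "Title", analyzer).Parse(queryText);
        TopDocs topDocs = searcher.Search(query, 10);

        var explanations = new List<string>();
        foreach (ScoreDoc hit in topDocs.ScoreDocs)
        {
            // Explanation.ToString() lists each factor (tf, idf, fieldNorm, ...)
            // that contributed to this document's score.
            Explanation explanation = searcher.Explain(query, hit.doc);
            explanations.Add(searcher.Doc(hit.doc).Get("Title") + Environment.NewLine + explanation);
        }
        searcher.Close();
        return explanations;
    }
}
```

Calling ExplainSketch.ExplainHits("jcb geo", "JCB excavator", "GEO survey data") and printing the returned strings shows the idf and fieldNorm values for each hit, which should make it clear why one document outranks another.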

You can also use Luke for that: http://code.google.com/p/luke/downloads/list

With Luke, you run a search, select a result, and click the Explain button to show an explanation of the hit.
