
Need help with code analysis: Lucene.Net search results in ASP.NET

I was looking for good code for searching an index using Lucene.Net. I found one sample that looks promising, but parts of it confuse me. If anyone here is familiar with Lucene.Net, please have a look at the code and tell me why the author constructed it this way.

The code comes from this article: http://www.codeproject.com/Articles/320219/Lucene-Net-ultra-fast-search-for-MVC-or-WebForms

Here is the code:

private static IEnumerable<SampleData> _search(string searchQuery, string searchField = "") {
    // validation
    if (string.IsNullOrEmpty(searchQuery.Replace("*", "").Replace("?", "")))
        return new List<SampleData>();

    // set up lucene searcher
    using (var searcher = new IndexSearcher(_directory, false)) {
        var hits_limit = 1000;
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);

        // search by single field
        if (!string.IsNullOrEmpty(searchField)) {
            var parser = new QueryParser(Version.LUCENE_29, searchField, analyzer);
            var query = parseQuery(searchQuery, parser);
            var hits = searcher.Search(query, hits_limit).ScoreDocs;
            var results = _mapLuceneSearchResultsToDataList(hits, searcher);
            analyzer.Close();
            searcher.Close();
            searcher.Dispose();
            return results;
        }
        // search by multiple fields (ordered by RELEVANCE)
        else {
            var parser = new MultiFieldQueryParser(
                Version.LUCENE_29, new[] { "Id", "Name", "Description" }, analyzer);
            var query = parseQuery(searchQuery, parser);
            var hits = searcher.Search(query, null, hits_limit, Sort.RELEVANCE).ScoreDocs;
            var results = _mapLuceneSearchResultsToDataList(hits, searcher);
            analyzer.Close();
            searcher.Close();
            searcher.Dispose();
            return results;
        }
    }
}

I have a couple of questions about the routine above:

1) Why does the developer of this code replace every * and ? in the search term with an empty string?
2) Why does it search once with QueryParser and again with MultiFieldQueryParser?
3) How does the developer detect whether the search term is one word or several words separated by spaces?
4) How can a wildcard search be done with this code? Where should I change the code to handle wildcards?
5) How do I handle a search for similar words? For example, if someone searches for "helo", results related to "hello" should come back.
6) The code limits the number of hits like this:

    var hits = searcher.Search(query, 1000).ScoreDocs;

If my search matches 5000 records and I limit it to 1000, how can I show the remaining 4000 with pagination? What is the purpose of the limit? I assume it is there for speed, but if I specify a limit, how can I show the other results? What would the logic be?

I would be glad if someone could discuss all of my points. Thanks.

1) Why does the developer of this code replace every * and ? in the search term with an empty string?

Because those are the special characters for wildcard search. The author is checking whether the search query contains anything besides wildcards. You don't usually want to search for "*" alone, for example. Note that the stripped string is only used for this check; the original searchQuery is what gets parsed.
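To make that check easier to see in isolation, here is a minimal sketch of it as a separate helper. The class and method names (QueryValidation, IsOnlyWildcards) are made up for illustration and are not part of the article's code. The null guard is also an addition: the quoted routine would throw on a null searchQuery.

```csharp
using System;

public static class QueryValidation
{
    // Hypothetical helper mirroring the validation line in _search.
    // Returns true when nothing is left once the wildcard characters
    // * and ? are removed, for example "", "*" or "?*".
    public static bool IsOnlyWildcards(string searchQuery)
    {
        // Assumption: treat null like an empty query instead of throwing.
        if (searchQuery == null) return true;

        string stripped = searchQuery.Replace("*", "").Replace("?", "");
        return string.IsNullOrEmpty(stripped);
    }
}
```

Like the original line, this uses IsNullOrEmpty, so a query such as " * " (with spaces) still counts as having content.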

2) Why does it search once with QueryParser and again with MultiFieldQueryParser?

He doesn't search with the QueryParsers per se. A parser takes the search query (a string) and builds a set of Query objects out of it. Those objects are then consumed by a Searcher object, which performs the actual search. Only one of the two branches runs per call, so the search itself happens once.

3) How does the developer detect whether the search term is one word or several words separated by spaces?

That's something the parser object takes care of, not the developer.

4) How can a wildcard search be done with this code? Where should I change the code to handle wildcards?

Wildcards are specified in the searchQuery parameter itself. Passing "test*" counts as a wildcard query, for example. Details are on the query syntax page.
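If you would rather not make users type the * themselves, one common approach is to add it to each word before calling _search. The sketch below assumes plain words separated by whitespace; the names WildcardQueryText and ToPrefixQuery are hypothetical, and it does not escape other query syntax characters. Keep in mind that the Lucene query syntax does not accept * or ? as the first character of a term by default, so this only appends.

```csharp
using System;
using System.Linq;

public static class WildcardQueryText
{
    // Hypothetical helper: turns "lucene sea" into "lucene* sea*" so that
    // every word is matched as a prefix by the query parser.
    public static string ToPrefixQuery(string userInput)
    {
        if (string.IsNullOrWhiteSpace(userInput)) return string.Empty;

        var terms = userInput
            // A null separator array means "split on any whitespace".
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            // Leave words alone if the user already ended them with *.
            .Select(word => word.EndsWith("*", StringComparison.Ordinal) ? word : word + "*");

        return string.Join(" ", terms);
    }
}
```

The result would then be passed on as the searchQuery argument, for example _search(WildcardQueryText.ToPrefixQuery(input), "Name").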

5) How do I handle a search for similar words? For example, if someone searches for "helo", results related to "hello" should come back.

I think you want a fuzzy search.
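In the Lucene query syntax a fuzzy term is written with a trailing tilde, so passing "helo~" as searchQuery should be enough to try it out. Fuzzy matching is based on Levenshtein (edit) distance. The sketch below is not Lucene's implementation, just a small illustration of that distance measure, with a made-up class name, to show why "helo" and "hello" count as close.

```csharp
using System;

public static class EditDistance
{
    // Classic two-row Levenshtein distance: the minimum number of single
    // character insertions, deletions and substitutions turning a into b.
    public static int Levenshtein(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.Min(substitution, Math.Min(deletion, insertion));
            }

            // Reuse the two rows instead of allocating a full matrix.
            var swap = previous;
            previous = current;
            current = swap;
        }

        return previous[b.Length];
    }
}
```

Here "helo" is one insertion away from "hello", which is the kind of near miss a fuzzy query is meant to catch.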

6) If my search matches 5000 records and I limit it to 1000, how can I show the remaining 4000 with pagination? What is the purpose of the limit? I assume it is there for speed, but if I specify a limit, how can I show the other results? What would the logic be?

Here's an article about that.
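As a rough idea of the usual logic: the number passed to Search caps how many top-scoring hits are collected, so for page N you ask for N times the page size and then skip the hits that belong to earlier pages. The helper below is only a sketch of that arithmetic. Paging, HitsNeeded and PageOf are hypothetical names, and it works on any list, so it makes no assumptions about the Lucene.Net version.

```csharp
using System;
using System.Collections.Generic;

public static class Paging
{
    // How many top hits to request so that the wanted page is covered.
    public static int HitsNeeded(int pageNumber, int pageSize) =>
        checked(pageNumber * pageSize);

    // Returns the items of the given 1-based page, or an empty list when
    // the page lies beyond the end of the hits.
    public static List<T> PageOf<T>(IReadOnlyList<T> hits, int pageNumber, int pageSize)
    {
        if (hits == null) throw new ArgumentNullException(nameof(hits));
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        int start = (pageNumber - 1) * pageSize;
        var page = new List<T>();
        for (int i = start; i < hits.Count && i < start + pageSize; i++)
        {
            page.Add(hits[i]);
        }
        return page;
    }
}
```

In the routine above this would mean replacing hits_limit with Paging.HitsNeeded(page, size), slicing the ScoreDocs array with PageOf, and mapping only that slice to SampleData.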

UPD: About multiple fields. The logic is as follows:

  • If searchField is specified, then it uses the simple parser, which produces a query like searchField:value1 searchField:value2 and so on.
  • If that parameter isn't there, MultiFieldQueryParser is used instead. Terms without a field prefix are searched in each of the listed fields (Id, Name, Description), and the passed searchQuery can still name fields explicitly, like "field1:value1 field2:value2". An example is on the same syntax page I mentioned earlier.

UPD2: Don't hesitate to look at the Java documentation and examples for Lucene, as it was originally a Java project (hence there are a lot of Java examples and tutorials). Lucene.NET is a port, and the two projects share a lot of functionality and classes.

UPD3: About fuzzy search, you might also want to implement your own analyzer for synonym search. We used that technique in a commercial project I worked on to handle common typos along with synonyms.
