
Need help analyzing Lucene.Net search code (ASP.NET)

I was looking for good code for searching an index with Lucene.Net. I found one sample that looks promising, but parts of it confuse me. If anyone familiar with Lucene.Net could look at the code and tell me why the author wrote it this way, I would appreciate it.

The code comes from this article: http://www.codeproject.com/Articles/320219/Lucene-Net-ultra-fast-search-for-MVC-or-WebForms

Here is the code:

private static IEnumerable<SampleData> _search(string searchQuery, string searchField = "") {
    // validation
    if (string.IsNullOrEmpty(searchQuery.Replace("*", "").Replace("?", ""))) return new List<SampleData>();

    // set up lucene searcher
    using (var searcher = new IndexSearcher(_directory, false)) {
        var hits_limit = 1000;
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);

        // search by single field
        if (!string.IsNullOrEmpty(searchField)) {
            var parser = new QueryParser(Version.LUCENE_29, searchField, analyzer);
            var query = parseQuery(searchQuery, parser);
            var hits = searcher.Search(query, hits_limit).ScoreDocs;
            var results = _mapLuceneSearchResultsToDataList(hits, searcher);
            analyzer.Close();
            searcher.Close();
            searcher.Dispose();
            return results;
        }
        // search by multiple fields (ordered by RELEVANCE)
        else {
            var parser = new MultiFieldQueryParser(Version.LUCENE_29, new[] { "Id", "Name", "Description" }, analyzer);
            var query = parseQuery(searchQuery, parser);
            var hits = searcher.Search(query, null, hits_limit, Sort.RELEVANCE).ScoreDocs;
            var results = _mapLuceneSearchResultsToDataList(hits, searcher);
            analyzer.Close();
            searcher.Close();
            searcher.Dispose();
            return results;
        }
    }
}

I have a couple of questions about the routine above:

1) Why does the developer of this code replace every * and ? with an empty string in the search term?
2) Why search once with QueryParser and again with MultiFieldQueryParser?
3) How does the developer detect whether the search term is one word or several words separated by spaces?
4) How can a wildcard search be done with this code? Where should the code be changed to handle wildcards?
5) How can I search for similar words? For example, if someone searches for "helo", results related to "hello" should come back.

var hits = searcher.Search(query, 1000).ScoreDocs;

6) If my search matches 5000 records and I limit the results to 1000, how can I show the remaining 4000 with pagination? What is the purpose of the limit? I assume it is for speed, but if I specify a limit, how can I show the other results? What would the logic be?

I would be glad if someone could discuss all of these points. Thanks.

1) Why does the developer of this code replace every * and ? with an empty string in the search term?

Because those are the special characters for wildcard search. The author checks whether the search query contains anything besides wildcards. You usually don't want to search for just "*", for example. The stripped string is used only for this check; the original searchQuery, wildcards included, is what gets parsed.
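
To make the check concrete, here is a minimal sketch of the same validation pulled out into a helper. The class and method names (SearchInputCheck, HasSearchableText) are made up for illustration and are not part of the article's code; the null guard is an addition, since the original expression would throw on a null query.

```csharp
using System;

public static class SearchInputCheck
{
    // Hypothetical helper: returns true when the query contains something
    // other than the wildcard characters '*' and '?'.
    public static bool HasSearchableText(string searchQuery)
    {
        // Added guard (not in the original): Replace() on null would throw.
        if (searchQuery == null) return false;

        // Same expression as the original validation line.
        string stripped = searchQuery.Replace("*", "").Replace("?", "");
        return !string.IsNullOrEmpty(stripped);
    }
}
```

With this helper, HasSearchableText("*") and HasSearchableText("?*") are false, while HasSearchableText("te?t*") is true, so only the last one would go on to be parsed.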

2) Why search once with QueryParser and again with MultiFieldQueryParser?

He doesn't search with the QueryParsers as such. Each one parses the search query (a string) and builds a set of Query objects from it. Those objects are then consumed by the Searcher object, which performs the actual search. Only one of the two branches runs per call, so the index is searched once, not twice.

3) How does the developer detect whether the search term is one word or several words separated by spaces?

That is something the Parser object takes care of, not the developer: it splits the query string into terms itself.

4) How can a wildcard search be done with this code? Where should the code be changed to handle wildcards?

The wildcards are specified in the searchQuery parameter itself. Passing "test*" counts as a wildcard query, for example, and nothing in the routine strips the wildcards before parsing. Details are on the Lucene query syntax page.
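
As a rough sketch of how the parser handles wildcard input, assuming Lucene.Net 3.0.3 (the class name WildcardQueryExamples, the Build method and the "Name" field are only illustrative):

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

public static class WildcardQueryExamples
{
    // Hypothetical helper: parses a query string that may contain * or ?.
    public static Query Build(string field, string searchQuery, bool allowLeadingWildcard)
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);
        var parser = new QueryParser(Version.LUCENE_29, field, analyzer);

        // By default the parser rejects terms that START with * or ?
        // (it throws a ParseException); this switch lifts that restriction.
        parser.AllowLeadingWildcard = allowLeadingWildcard;

        return parser.Parse(searchQuery);
    }
}
```

Build("Name", "test*", false) and Build("Name", "te?t", false) parse as they are. A term such as "*est" needs allowLeadingWildcard set to true, and leading wildcards can be slow on a large index because many terms have to be scanned.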

5) How can I search for similar words? For example, if someone searches for "helo", results related to "hello" should come back.

I think you want a fuzzy search.
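
A fuzzy search can be requested in two ways, sketched below under the assumption of Lucene.Net 3.0.3. The class and method names are hypothetical, and the sketch assumes the input is a single plain word with no query-syntax characters in it.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

public static class FuzzyQueryExamples
{
    // Option 1: use the query syntax. A trailing ~ marks a fuzzy term,
    // so "helo" becomes "helo~".
    public static Query FromSyntax(string field, string word)
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);
        var parser = new QueryParser(Version.LUCENE_29, field, analyzer);
        return parser.Parse(word + "~");
    }

    // Option 2: build the FuzzyQuery object directly. This bypasses the
    // analyzer, so the word is lower-cased by hand here on the assumption
    // that the index was built with StandardAnalyzer.
    public static Query FromTerm(string field, string word, float minimumSimilarity)
    {
        return new FuzzyQuery(new Term(field, word.ToLowerInvariant()), minimumSimilarity);
    }
}
```

Either query can be passed to searcher.Search in place of the parsed query. With the routine from the question, simply sending "helo~" as the search term should have the same effect as option 1.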

6) If my search matches 5000 records and I limit the results to 1000, how can I show the remaining 4000 with pagination? What is the purpose of the limit? I assume it is for speed, but if I specify a limit, how can I show the other results? What would the logic be?

Here's an article about that.
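
One common approach, which is an assumption here and not necessarily what the linked article describes, is to ask Lucene for only as many top hits as are needed to reach the requested page and then skip the earlier pages. The limit exists because the searcher keeps only the best N hits while collecting, so a smaller N means less work and memory. The helper names below (Paging, HitsNeeded, GetPage) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Number of top hits to request so that the 1-based page is covered.
    public static int HitsNeeded(int pageNumber, int pageSize)
    {
        return pageNumber * pageSize;
    }

    // Returns the slice of an already ranked hit list that belongs to the page.
    public static List<T> GetPage<T>(IEnumerable<T> rankedHits, int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");
        return rankedHits.Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList();
    }

    // Assumed usage with the searcher from the question:
    //   var topDocs  = searcher.Search(query, Paging.HitsNeeded(page, size));
    //   var pageHits = Paging.GetPage(topDocs.ScoreDocs, page, size);
    //   // topDocs.TotalHits (as named in Lucene.Net 3.0.3) gives the total
    //   // match count, which is enough to compute the number of pages.
}
```

For page 3 with 10 results per page, this requests the top 30 hits and displays hits 21 to 30, so the fixed limit of 1000 is no longer needed.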

UPD: About multiple fields. The logic is as follows:

  • If searchField is specified, then the simple parser is used; it produces a query like searchField:value1 searchField:value2, and so on.
  • If that parameter isn't there, MultiFieldQueryParser is used instead: every term in searchQuery that does not name a field is searched in all of the listed fields (Id, Name, Description). The searchQuery can still specify fields and values itself, like "field1:value1 field2:value2". An example is on the same syntax page that I mentioned earlier.
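
The choice between the two parsers can be sketched as follows, again assuming Lucene.Net 3.0.3. ParserChoice and Build are illustrative names; the field list is the one from the question's code.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

public static class ParserChoice
{
    private static readonly string[] DefaultFields = { "Id", "Name", "Description" };

    // Hypothetical helper: picks the parser the same way the routine does
    // and returns the parsed query without running a search.
    public static Query Build(string searchQuery, string searchField)
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);

        // No field given: terms are expanded over all default fields.
        // Field given: terms are looked up in that single field.
        QueryParser parser = string.IsNullOrEmpty(searchField)
            ? new MultiFieldQueryParser(Version.LUCENE_29, DefaultFields, analyzer)
            : new QueryParser(Version.LUCENE_29, searchField, analyzer);

        return parser.Parse(searchQuery);
    }
}
```

Printing Build("lucene", "Name").ToString() and Build("lucene", "").ToString() side by side is a quick way to see how each parser expands the same input.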

UPD2: Don't hesitate to look at the Java documentation and examples for Lucene, since it was originally a Java project (hence there are a lot of Java examples and tutorials). Lucene.NET is a port, and the two projects share a lot of functionality and classes.

UPD3: About fuzzy search: you might also want to implement your own analyzer for synonym search (we used that technique in a commercial project I worked on, to handle common typos along with synonyms).
