简体   繁体   中英

Lucene .NET search results

I'm using this code to index:

public void IndexEmployees(IEnumerable<Employee> employees)
{
    var indexPath = GetIndexPath();
    var directory = FSDirectory.Open(indexPath);

    var indexWriter = new IndexWriter(directory, new StandardAnalyzer(Version.LUCENE_29), true, IndexWriter.MaxFieldLength.UNLIMITED);

    foreach (var employee in employees)
    {
        var document = new Document();
        document.Add(new Field("EmployeeId", employee.EmployeeId.ToString(), Field.Store.YES, Field.Index.NO, Field.TermVector.NO));
        document.Add(new Field("Name", employee.FirstName + " " + employee.LastName, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));
        document.Add(new Field("OfficeName", employee.OfficeName, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));
        document.Add(new Field("CompetenceRatings", string.Join(" ", employee.CompetenceRatings.Select(cr => cr.Name)), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));

        indexWriter.AddDocument(document);
    }

    indexWriter.Optimize();
    indexWriter.Close();

    var indexReader = IndexReader.Open(directory, true);
    var spell = new SpellChecker.Net.Search.Spell.SpellChecker(directory);
    spell.ClearIndex();

    spell.IndexDictionary(new LuceneDictionary(indexReader, "Name"));
    spell.IndexDictionary(new LuceneDictionary(indexReader, "OfficeName"));
    spell.IndexDictionary(new LuceneDictionary(indexReader, "CompetenceRatings"));
}

public DirectoryInfo GetIndexPath()
{
    return new DirectoryInfo(HttpContext.Current.Server.MapPath("/App_Data/EmployeeIndex/"));
}

And this code to find results (as well as suggestions):

public SearchResult Search(DirectoryInfo indexPath, string[] searchFields, string searchQuery)
{
    var directory = FSDirectory.Open(indexPath);

    var standardAnalyzer = new StandardAnalyzer(Version.LUCENE_29);

    var indexReader = IndexReader.Open(directory, true);
    var indexSearcher = new IndexSearcher(indexReader);

    var parser = new MultiFieldQueryParser(Version.LUCENE_29, searchFields, standardAnalyzer);
    //parser.SetDefaultOperator(QueryParser.Operator.OR);
    var query = parser.Parse(searchQuery);

    var hits = indexSearcher.Search(query, null, 5000);

    return new SearchResult
                {
                    Suggestions = FindSuggestions(indexPath, searchQuery),
                    LuceneDocuments = hits
                        .scoreDocs
                        .Select(scoreDoc => indexSearcher.Doc(scoreDoc.doc))
                        .ToArray()
                };
}

public string[] FindSuggestions(DirectoryInfo indexPath, string searchQuery)
{
    var directory = FSDirectory.Open(indexPath);

    var spell = new SpellChecker.Net.Search.Spell.SpellChecker(directory);

    var similarWords = spell.SuggestSimilar(searchQuery, 20);

    return similarWords;
}

var searchResult = Search(GetIndexPath(), new[] { "Name", "OfficeName", "CompetenceRatings" }, "admin*");

Simple queries like: admin or admin* doesnt give me any results. I know that there is an employee with that name. I want to be able to find James Jameson if I search for James.

Thanks!

First thing. You have to commit the changes to the index.

indexWriter.Optimize();
indexWriter.Commit(); //Add This
indexWriter.Close();

Edit#2 Also, keep it simple until you get something that works.

Comment this stuff out.

//var indexReader = IndexReader.Open(directory, true);
//var spell = new SpellChecker.Net.Search.Spell.SpellChecker(directory);
//spell.ClearIndex();

//spell.IndexDictionary(new LuceneDictionary(indexReader, "Name"));
//spell.IndexDictionary(new LuceneDictionary(indexReader, "OfficeName"));
//spell.IndexDictionary(new LuceneDictionary(indexReader, "CompetenceRatings"));

Edit#3

The fields you are searching are probably not going to change often. I would include them in your search function.

string[] fields = new string[] { "Name", "OfficeName", "CompetenceRatings" };

The biggest reason I suggest this is that Fields are case-sensitive and sometimes you wont get any results and it's because you search the "name" field (which doesn't exist) instead of the "Name" field. Easier to spot the mistake this way.

In my (limited) experience working with Lucene, I've found that you have to build up your own query in order to get "google" like behavior. Here is what I do, YMMV, but it generates expected results in my application. The basic idea is you combine a term query (exact match), a prefix query (anything that begins with the term), and a fuzzy query for each term in the search string. The code below won't compile, but gives you the idea

Query GetQuery(string querystring)
{

   Search.Search.BooleanQuery query = new Search.Search.BooleanQuery();

   Search.Analysis.TokenStream tk = StandardAnalyzerInstance.TokenStream(null, new StringReader(querystring));
   Search.Analysis.Tokenattributes.TermAttribute ta = tk.GetAttribute(typeof(Search.Analysis.Tokenattributes.TermAttribute)) as Search.Analysis.Tokenattributes.TermAttribute;

    while (tk.IncrementToken())
    {
         string term = ta.Term();
         Search.Search.BooleanQuery bq = new Search.Search.BooleanQuery();
         bq.Add(new Search.Search.TermQuery(new Search.Index.Term("fieldToQuery", term)), Search.Search.BooleanClause.Occur.SHOULD);
         bq.Add(new Search.Search.PrefixQuery(new Search.Index.Term("fieldToQuery", term)), Search.Search.BooleanClause.Occur.SHOULD);
         bq.Add(new Search.Search.FuzzyQuery(new Search.Index.Term("fieldToQuery", term)), Search.Search.BooleanClause.Occur.SHOULD);
         query.Add(bq, Search.Search.BooleanClause.Occur.MUST);
    }

    return query;
}

That Parse() method is inherited. Have you tried utilizing the static methods that returns a Query object?

Parse(Version matchVersion, String[] queries, String[] fields, Analyzer analyzer)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM