
Lucene.NET search index approach

I am trying to put together a test case for using Lucene.NET on one of our websites. I'd like to do the following:

Index a single unique id. Index a comma-delimited string of terms or tags.

For example:

Item 1: Id = 1 Tags = Something,Separated-Term

I will then structure the search so that I can look up documents by tag, i.e.

tags:something OR tags:separated-term

I need to maintain the exact term value in order to search against it.

I have something running, and the search query is being parsed as expected, but I am not seeing any results. Here's some code.

My parser (_luceneAnalyzer is passed into my indexing service):

var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_CURRENT, "Tags", _luceneAnalyzer);
parser.SetDefaultOperator(QueryParser.Operator.AND);
return parser;

My Lucene.NET document creation:

var doc = new Document();

var id = new Field(
    "Id",
    NumericUtils.IntToPrefixCoded(indexObject.id),
    Field.Store.YES,
    Field.Index.NOT_ANALYZED,
    Field.TermVector.NO);

var tags = new Field(
    "Tags",
    string.Join(",", indexObject.Tags.ToArray()),
    Field.Store.NO,
    Field.Index.ANALYZED,
    Field.TermVector.YES);

doc.Add(id);
doc.Add(tags);

return doc;

My search:

var parser = BuildQueryParser();
var query = parser.Parse(searchQuery);
var searcher = Searcher;

TopDocs hits = searcher.Search(query, null, max);
IList<SearchResult> result = new List<SearchResult>();
float scoreNorm = 1.0f / hits.GetMaxScore();

for (int i = 0; i < hits.scoreDocs.Length; i++)
{
    float score = hits.scoreDocs[i].score * scoreNorm;
    result.Add(CreateSearchResult(searcher.Doc(hits.scoreDocs[i].doc), score));
}

return result;

I have two documents in my index, one with the tag "Something" and one with the tags "Something" and "Separated-Term". It's important for the hyphen (-) to remain in the terms, as I want an exact match on the full value.

When I search with "tags:Something" I do not get any results.

Question

What Analyzer should I be using to achieve the search index I am after? Are there any pointers for putting together a search such as this? Why is my current search not returning any results?

Many thanks

A few ideas to think about:

  1. Try the search "Tags:Something" (your example uses a lowercased field name, "tags", and Lucene field names are case-sensitive).
  2. I think you'll need a per-field analyser: one for "Id" and one for "Tags".
  3. Luke is a really good tool for examining indices and queries (it works fine with data created by Lucene.NET).

Hope this helps,
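
To illustrate the per-field analyser idea in point 2, here is a minimal sketch rather than a definitive setup. It assumes the Lucene.NET 2.9-style API used in the question. The class name TagSearchAnalysis and both method names are hypothetical. The choice of KeywordAnalyzer for "Id" and "Tags", with StandardAnalyzer as the fallback, is also an assumption, since the question does not say which analyzer is currently in use.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;

// Hypothetical helper class; the names are illustrative only.
public static class TagSearchAnalysis
{
    // Builds an analyzer that treats "Id" and "Tags" values as single,
    // unmodified tokens and falls back to StandardAnalyzer for other fields.
    public static Analyzer BuildAnalyzer()
    {
        var wrapper = new PerFieldAnalyzerWrapper(
            new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_CURRENT));

        // KeywordAnalyzer emits the whole field value as one token,
        // so case and hyphens are preserved.
        wrapper.AddAnalyzer("Id", new KeywordAnalyzer());
        wrapper.AddAnalyzer("Tags", new KeywordAnalyzer());
        return wrapper;
    }

    // Same parser setup as in the question, but using the per-field analyzer.
    public static QueryParser BuildQueryParser(Analyzer analyzer)
    {
        var parser = new QueryParser(
            Lucene.Net.Util.Version.LUCENE_CURRENT, "Tags", analyzer);
        parser.SetDefaultOperator(QueryParser.Operator.AND);
        return parser;
    }
}
```

Note that with KeywordAnalyzer a comma-joined value such as "Something,Separated-Term" would be indexed as one single term, so this approach only works if each tag is added as its own field value. The same analyzer should be passed to both the IndexWriter and the QueryParser so that index-time and query-time terms agree.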

It appears you can add multiple fields with the same name to a document, so I changed my code to:

foreach (string tag in vehicle.Tags)
{
    var tags = new Field(
        TAGS,
        tag,
        Field.Store.YES,
        Field.Index.ANALYZED,
        Field.TermVector.YES);

    doc.Add(tags);
}

I can now search by single or multiple tags in the "Tags" field.
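
Since the question asks for exact matches that keep the hyphen, one possible variation on the loop above is to index each tag with Field.Index.NOT_ANALYZED and search with TermQuery instead of going through the query parser. The following is only a sketch under stated assumptions: it uses the Lucene.NET 2.9-style API from the question, an in-memory RAMDirectory, and a hypothetical class ExactTagDemo with a hypothetical method CountMatches. It is not the approach taken in the original post.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical demo class; the names are illustrative only.
public static class ExactTagDemo
{
    // Indexes one document per tag array, then counts the documents
    // that carry at least one of the wanted tags (exact, case-sensitive).
    public static int CountMatches(string[][] taggedItems, params string[] wantedTags)
    {
        var directory = new RAMDirectory();
        var writer = new IndexWriter(
            directory, new KeywordAnalyzer(), true,
            IndexWriter.MaxFieldLength.UNLIMITED);

        foreach (var tags in taggedItems)
        {
            var doc = new Document();
            foreach (var tag in tags)
            {
                // NOT_ANALYZED stores the tag as a single term, unchanged.
                doc.Add(new Field("Tags", tag,
                    Field.Store.YES, Field.Index.NOT_ANALYZED));
            }
            writer.AddDocument(doc);
        }
        writer.Close();

        // SHOULD clauses give OR semantics across the wanted tags.
        var query = new BooleanQuery();
        foreach (var tag in wantedTags)
        {
            query.Add(new TermQuery(new Term("Tags", tag)),
                BooleanClause.Occur.SHOULD);
        }

        var searcher = new IndexSearcher(directory, true);
        TopDocs hits = searcher.Search(query, null, 10);
        searcher.Close();
        return hits.totalHits;
    }
}
```

With the two documents described in the question, a call such as CountMatches(items, "Separated-Term") should match only the second document, while a lowercased "separated-term" should match nothing, because TermQuery bypasses analysis and compares terms exactly.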

