Lucene.NET search index approach

I am trying to put together a test case for using Lucene.NET on one of our websites. I'd like to do the following:

Index a single unique ID. Index a comma-delimited string of terms or tags.

For example:

Item 1: Id = 1, Tags = Something,Separated-Term

I will then structure the search so I can look up documents by tag, i.e.

tags:something OR tags:separate-term

I need to maintain the exact term value in order to search against it.

I have something running, and the search query is being parsed as expected, but I am not seeing any results. Here's some code.

My parser (_luceneAnalyzer is passed into my indexing service):

var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_CURRENT, "Tags", _luceneAnalyzer);
parser.SetDefaultOperator(QueryParser.Operator.AND);
return parser;

My Lucene.NET document creation:

var doc = new Document();

var id = new Field(
    "Id",
    NumericUtils.IntToPrefixCoded(indexObject.id),
    Field.Store.YES,
    Field.Index.NOT_ANALYZED,
    Field.TermVector.NO);

var tags = new Field(
    "Tags",
    string.Join(",", indexObject.Tags.ToArray()),
    Field.Store.NO,
    Field.Index.ANALYZED,
    Field.TermVector.YES);

doc.Add(id);
doc.Add(tags);

return doc;

My search:

var parser = BuildQueryParser();
var query = parser.Parse(searchQuery);
var searcher = Searcher;

TopDocs hits = searcher.Search(query, null, max);
IList<SearchResult> result = new List<SearchResult>();
float scoreNorm = 1.0f / hits.GetMaxScore();

for (int i = 0; i < hits.scoreDocs.Length; i++)
{
    float score = hits.scoreDocs[i].score * scoreNorm;
    result.Add(CreateSearchResult(searcher.Doc(hits.scoreDocs[i].doc), score));
}

return result;

I have two documents in my index, one with the tag "Something" and one with the tags "Something" and "Separated-Term". It's important for the hyphen to remain in the terms, as I want an exact match on the full value.

When I search with "tags:Something" I do not get any results.

Question

What Analyzer should I be using to achieve the search index I am after? Are there any pointers for putting together a search such as this? Why is my current search not returning any results?

Many thanks

A few ideas to think about:

  1. Try the search "Tags:Something" (you had lowercased the field name "Tags" in your example, and Lucene field names are case-sensitive)
  2. I think you'll need a per-field analyser: one for "Id" and one for "Tags"
  3. Luke is a really good tool for examining indices and queries (it works fine for data created by Lucene.NET)

Hope this helps.
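
The per-field analyser idea could be sketched roughly as follows. This is only an illustration under some assumptions: it targets the Lucene.NET 2.9-style API used in the question, the class and method names (TagSearchSketch, BuildAnalyzer, BuildQueryParser) are hypothetical, and it assumes each tag is indexed as its own field value, since KeywordAnalyzer would otherwise treat a comma-joined string as one single term.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;

public static class TagSearchSketch
{
    // Hypothetical helper: one analyzer instance, intended to be shared by
    // the IndexWriter and the QueryParser so both see terms the same way.
    public static Analyzer BuildAnalyzer()
    {
        // Fields without an explicit entry fall back to StandardAnalyzer.
        var analyzer = new PerFieldAnalyzerWrapper(
            new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));

        // KeywordAnalyzer emits the whole field value as a single token,
        // so "Separated-Term" keeps its hyphen and its casing.
        analyzer.AddAnalyzer("Id", new KeywordAnalyzer());
        analyzer.AddAnalyzer("Tags", new KeywordAnalyzer());
        return analyzer;
    }

    // "Tags" is the default field, spelled exactly as it was indexed.
    public static QueryParser BuildQueryParser(Analyzer analyzer)
    {
        return new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "Tags", analyzer);
    }
}
```

With this setup the query text would need to use the same field name and tag casing as the index, for example Tags:Something rather than tags:something. Tags containing spaces would presumably need quoting in the query string, because the parser splits on whitespace before the analyzer runs.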

It appears you can add multiple fields with the same name to a document, so I changed my code to:

foreach (string tag in vehicle.Tags)
{
    var tags = new Field(
        TAGS,
        tag,
        Field.Store.YES,
        Field.Index.ANALYZED,
        Field.TermVector.YES);

    doc.Add(tags);
}

I can now search by single or multiple tags in the "Tags" field.
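
If an exact match on the full tag value (hyphen and casing included) is the goal, one possible variation on the loop above is to index each tag with Field.Index.NOT_ANALYZED and build the query from TermQuery objects instead of going through the query parser. The following is a sketch, not the approach the poster settled on: ExactTagSketch, AddTags, AnyOf and TagsField are hypothetical names, TagsField is assumed to hold the same value as the TAGS constant, and BooleanClause.Occur.SHOULD is the 2.9-style spelling (later Lucene.NET releases expose it as Occur.SHOULD).

```csharp
using System.Collections.Generic;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class ExactTagSketch
{
    // Assumed to match the TAGS constant used when indexing.
    public const string TagsField = "Tags";

    // Adds each tag as its own value of the same field, bypassing the
    // analyzer so the stored term is the tag text exactly as given.
    public static void AddTags(Document doc, IEnumerable<string> tags)
    {
        foreach (string tag in tags)
        {
            doc.Add(new Field(TagsField, tag, Field.Store.YES, Field.Index.NOT_ANALYZED));
        }
    }

    // Builds an OR query over the given tags; a document matches if it
    // carries at least one of them. No query-string parsing is involved.
    public static Query AnyOf(IEnumerable<string> tags)
    {
        var query = new BooleanQuery();
        foreach (string tag in tags)
        {
            query.Add(new TermQuery(new Term(TagsField, tag)), BooleanClause.Occur.SHOULD);
        }
        return query;
    }
}
```

Because a TermQuery is never analyzed, the tag passed to AnyOf has to equal the indexed tag character for character, so "something" would not match a document tagged "Something".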
