
Hibernate-Search - Case insensitive wildcard search using lucene query parser syntax (not using QueryBuilder!)

First, here is my Hibernate-Search index-time setup:

// ...
@Indexed(index = "XXXRequestIndex")
@AnalyzerDef(name = "toLowercaseAnalyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    })
public class XXXRequest implements Serializable {
    // ...
    @Field(analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "toLowercaseAnalyzer"))
    @SortableField
    private String status;
    // ...
}

I saw this thread, where .overridesForField(...) is set up on the QueryBuilder for a field at query time in order to run case-insensitive wildcard queries: Hibernate Search | ngram analyzer with minGramSize 1

I need to do something similar for one particular field only ("status"), but I am NOT using a QueryBuilder. Instead, I am parsing an incoming lucene query string with a MultiFieldQueryParser. I cannot switch to building the query with a QueryBuilder, because the callers of the code must be able to issue their own dynamic queries using the lucene query parser syntax (more or less as described in https://lucene.apache.org/core/2_9_4/queryparsersyntax.html).

So when the caller sends the lucene query status:*n\\ Pr*, it does not match "In Processing". However, a query like status:*n\\ pr* does match "In Processing".

My query code:

Analyzer analyzer = new KeywordAnalyzer();
String[] fields = ...
MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
    fields,
    analyzer);

Query luceneQuery = queryParser.parse(luceneFilterString);
List results = fullTextQuery.getResultList();

How can I make the query case-insensitive?

Wildcard searches are not analyzed by the MultiFieldQueryParser, so you have to do the filtering yourself: apply the same filters as your index-time analyzer (here, lowercasing) manually to the input string before parsing it.
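One way to apply the lowercase filter by hand is to rewrite the incoming query string before it reaches the parser. The following is only a minimal sketch under several assumptions: the class and method names (StatusTermLowercaser, lowercaseStatusTerms) are hypothetical, the only field that needs the treatment is "status", and the index-time analyzer does nothing beyond lowercasing. It lowercases the term or quoted phrase that follows status: and leaves operators and other fields untouched. It does not handle range queries such as status:[A TO B].

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hibernate Search or Lucene.
final class StatusTermLowercaser {

    // "status:" followed by either a quoted phrase or a run of characters
    // that ends at the first unescaped whitespace. A backslash escapes the
    // character after it, so "\ " stays part of the term.
    private static final Pattern STATUS_CLAUSE = Pattern.compile(
            "(?<![\\w.])status:(\"(?:\\\\.|[^\"\\\\])*\"|(?:\\\\.|[^\\s\\\\])+)");

    private StatusTermLowercaser() {
    }

    // Returns the query string with every status: term lowercased.
    static String lowercaseStatusTerms(String luceneQuery) {
        Matcher matcher = STATUS_CLAUSE.matcher(luceneQuery);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String lowered = "status:" + matcher.group(1).toLowerCase(Locale.ROOT);
            // quoteReplacement keeps backslashes in the term from being
            // interpreted as replacement escapes.
            matcher.appendReplacement(result, Matcher.quoteReplacement(lowered));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

In the query code from the question, the call would sit just before parsing, for example queryParser.parse(StatusTermLowercaser.lowercaseStatusTerms(luceneFilterString)). Operators such as AND and OR are left alone because they are separated from the term by unescaped whitespace.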

Another option would be to use the simple query string feature (see https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#_simple_query_string_queries), which can apply analysis to this sort of query. However, it only supports prefix queries, so you can search for process* but not *cessing. If this limitation is acceptable for you, I really recommend this approach.
