
Multiple words query in Lucene

For example: there is a field "description" in a Lucene document. Let's say the content of "description" is [ hello foo bar ]. I want the query [ hello f ] to hit this document, while [ hello ff ] and [ hello b ] should not hit it.

I build the Query programmatically, for example by adding PrefixQuery and TermQuery objects to a BooleanQuery, but the results are not what I expect. The analyzer is StandardAnalyzer.

Test cases:

a): new PrefixQuery(new Term("description", "hello f")) -> 0 hits

b): PhraseQuery query = new PhraseQuery(); query.add( new Term("description", "hello f*") ) -> 0 hits

c): PhraseQuery query = new PhraseQuery(); query.add( new Term("description", "hello f") ) -> 0 hits

Any recommendations? Thanks!

It doesn't work because you are passing multiple words to a single Term object. StandardAnalyzer indexes "hello", "foo" and "bar" as separate terms, so no indexed term starts with "hello f". If you want every search word to be matched as a prefix, you need to:

  1. Tokenize the input string with your analyzer, which splits the search text "hello f" into "hello" and "f":

    TokenStream tokenStream = analyzer.tokenStream(fieldName, new StringReader(searchText));
    CharTermAttribute termAttribute = tokenStream.getAttribute(CharTermAttribute.class);

    List<String> tokens = new ArrayList<String>();
    tokenStream.reset(); // must be called before incrementToken() in Lucene 4.0 and later
    while (tokenStream.incrementToken()) {
        tokens.add(termAttribute.toString());
    }
    tokenStream.close();

  2. Put each token into a Term, wrap each Term in a PrefixQuery, and add all the PrefixQuery objects to a BooleanQuery.

EDIT: For example:

BooleanQuery booleanQuery = new BooleanQuery();

// Every token must match at least one indexed term as a prefix.
for (String token : tokens) {
    booleanQuery.add(new PrefixQuery(new Term(fieldName, token)), Occur.MUST);
}
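To see why the single-Term queries return 0 hits while the per-token version matches, here is a minimal plain-Java sketch of the matching logic. It does not use Lucene: the class PrefixMatchSketch and its methods are hypothetical names for illustration, and tokenize is only a rough stand-in for StandardAnalyzer (lowercase, split on whitespace).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Plain-Java model of term matching; illustrative only, no Lucene involved. */
class PrefixMatchSketch {

    /** Assumption: a rough stand-in for StandardAnalyzer (lowercase, split on whitespace). */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    /** Models one PrefixQuery: true if any indexed term starts with the prefix. */
    static boolean anyTermStartsWith(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Models a BooleanQuery whose PrefixQuery clauses all use Occur.MUST. */
    static boolean allTokensMatchAsPrefix(List<String> indexedTerms, String searchText) {
        for (String token : tokenize(searchText)) {
            if (!anyTermStartsWith(indexedTerms, token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The field content is indexed as three separate terms.
        List<String> indexedTerms = tokenize("hello foo bar");

        // Test case a): the whole string used as one prefix matches no term.
        System.out.println(anyTermStartsWith(indexedTerms, "hello f"));       // false

        // Tokenized first, then one prefix clause per token.
        System.out.println(allTokensMatchAsPrefix(indexedTerms, "hello f"));  // true
        System.out.println(allTokensMatchAsPrefix(indexedTerms, "hello ff")); // false
        System.out.println(allTokensMatchAsPrefix(indexedTerms, "hello b"));  // true
    }
}
```

Note the last line: because the clauses ignore term positions, [ hello b ] also matches, since "bar" starts with "b". If that query really must not hit, a position-aware query such as Lucene's MultiPhraseQuery may be closer to the requirement than a BooleanQuery of prefix clauses.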
