
Combining Numeric Range Query with Term Query in Lucene

I would like to combine a numeric range query with a term query in Lucene. For example, I want to search my indexed documents for those that have between 10 and 20 pages and the title "Hello World".

It does not seem possible to use the QueryParser to generate this query for me; the range query that the QueryParser generates appears to be a text one.

I would definitely appreciate an example of how to combine a numeric range query with a term query. I would also be open to an alternative way of searching my index.

Thanks

Well, it looks like I figured this one out on my own. You can use Query.combine() to OR queries together. I have included an example below.

String termQueryString = "title:\"hello world\"";
Query termQuery = parser.parse(termQueryString);

Query pageQueryRange = NumericRangeQuery.newIntRange("page_count", 10, 20, true, true);

Query query = termQuery.combine(new Query[]{termQuery, pageQueryRange});

You can also create a custom QueryParser that overrides the protected Query getRangeQuery(...) method, which should return a NumericRangeQuery instance when the "page_count" field is encountered.

Like so...

public class CustomQueryParser extends QueryParser {

    public CustomQueryParser(Version matchVersion, String f, Analyzer a) {
        super(matchVersion, f, a);
    }

    @Override
    protected Query getRangeQuery(final String field, final String part1, final String part2, final boolean inclusive) throws ParseException {

        if ("page_count".equals(field)) {
            return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1), Integer.parseInt(part2), inclusive, inclusive);
        }

        // return default
        return super.getRangeQuery(field, part1, part2, inclusive);    
    }
}

Then use CustomQueryParser when parsing textual queries.

Like so...

...
final QueryParser parser = new CustomQueryParser(Version.LUCENE_35, "some_default_field", new StandardAnalyzer(Version.LUCENE_35));
final Query q = parser.parse("title:\"hello world\" AND page_count:[10 TO 20]");
...

This all, of course, assumes that NumericField(...).setIntValue(...) was used when page_count values were added to documents.

You may use BooleanQuery:

var combinedQuery = new BooleanQuery();
combinedQuery.Add(new TermQuery(new Term("title","hello world")),Occur.MUST);
combinedQuery.Add(NumericRangeQuery.newIntRange("page_count", 10, 20, true, true),Occur.MUST);

RangeQuery amountQuery = new RangeQuery(lowerTerm, upperTerm, true);

Lucene treats numbers in a plain text field as words, so the numbers are ordered alphabetically (character by character), not by numeric value.

1
12
123
1234
etc.
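
To make the consequence concrete, here is a small plain-Java sketch (no Lucene involved) that compares numbers the way text terms compare. The class and method names are made up for illustration, and it assumes the terms are plain ASCII digit strings, for which String.compareTo should order them the same way a text range would.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LexicographicOrderDemo {

    // Sorts the decimal string forms of the values the way plain text terms compare.
    static List<String> sortAsText(int... values) {
        List<String> terms = new ArrayList<>();
        for (int v : values) {
            terms.add(Integer.toString(v));
        }
        Collections.sort(terms); // String.compareTo: character by character
        return terms;
    }

    // True if term lies in [lower, upper] under string comparison,
    // which is how a text-based range decides membership.
    static boolean inTextRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(sortAsText(9, 10, 100, 2, 15)); // [10, 100, 15, 2, 9]
        System.out.println(inTextRange("100", "10", "20")); // true, although 100 > 20
        System.out.println(inTextRange("9", "1", "20"));    // false, although 1 <= 9 <= 20
    }
}
```

So a text range of 10 to 20 can pick up a 100-page document and a range of 1 to 20 can miss a 9-page one.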

That being said, you can still use the range query, you just need to be more clever about it.

In order to query numeric values correctly, you need to pad your integers with leading zeros so that they all have the same length (wide enough for your maximum supported value).

0001
0012
0123
1234
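
A minimal sketch of the padding step in plain Java, assuming a fixed width of 4 digits as in the list above. The names ZeroPaddedTerms, pad and inTextRange are hypothetical helpers, not part of Lucene.

```java
import java.util.Locale;

class ZeroPaddedTerms {

    // Pads a non-negative value with leading zeros to a fixed width, e.g. 12 -> "0012".
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled by plain padding");
        }
        if (Integer.toString(value).length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // Same string comparison a text-based range would apply.
    static boolean inTextRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String lower = pad(10, 4);
        String upper = pad(20, 4);
        System.out.println(inTextRange(pad(100, 4), lower, upper)); // false
        System.out.println(inTextRange(pad(15, 4), lower, upper));  // true
    }
}
```

The same padding has to be applied to the values at index time and to both bounds of the range at query time, otherwise the comparison falls back to the broken ordering.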

Obviously, this doesn't work for negative numbers (-2 < -1 numerically, but "-2" sorts after "-1" as text), and hopefully you won't have to deal with them. Here's a useful article for negatives if you do encounter them: http://wiki.apache.org/lucene-java/SearchNumericalFields
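
One possible way to handle negatives, sketched in plain Java: shift every int by a fixed offset so that all values become non-negative, then pad as before. This is my own illustration of the general idea and not necessarily what the linked article recommends; SignedSortableTerms, encode and decode are hypothetical names.

```java
import java.util.Locale;

class SignedSortableTerms {

    // Maps the int range [-2^31, 2^31 - 1] onto [0, 2^32 - 1] and pads to 10 digits,
    // so that string order matches numeric order.
    static String encode(int value) {
        long shifted = (long) value - Integer.MIN_VALUE; // always >= 0
        return String.format(Locale.ROOT, "%010d", shifted);
    }

    // Reverses encode().
    static int decode(String term) {
        return (int) (Long.parseLong(term) + Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(encode(-2)); // 2147483646
        System.out.println(encode(-1)); // 2147483647
        System.out.println(encode(-2).compareTo(encode(-1)) < 0); // true
    }
}
```

The encoded terms are no longer human-readable, so you would probably want to store the original value in a separate stored field if you need to display it.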
