
How do I query a range and a missing value in one Lucene query?

The query:

start: [ 2012101700 TO * ] OR end: [* TO 2012101700]

will give me the result where the start is after today or the end is before today.

This query will give me all the records that have neither a start nor an end:

 -(end: [ * TO *] OR start: [* TO *])

( The strange bracketing is due to oddities in the query parser, see: Solr query with grouping not working )

However, I want to combine these so that my results are all the records with results in the defined range or missing completely. This query doesn't work though as the [* TO *] spoils it.

(end: [ * TO 2012101700] OR start: [2012101700 TO *])
 OR -(end: [ * TO *] 
 OR start: [* TO *])

Any suggestions?

Thanks

Dave

Lucene doesn't handle 'OR NOT' style queries well.

The reason is how Lucene stores its data. It doesn't have a table to iterate over and filter out anything that matches the given query; it actually has to find documents. For an "OR NOT" query, it can find all the documents that match and eliminate them, but it can't find the documents that don't match, because it has no criteria to search for them with.

Another way to think about it: when querying a database you might start with Select * from tablename, and that is the information you are missing here. You need a way of identifying the set of documents you are starting with, similar to a table of records.
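To make that concrete, here is a minimal sketch of a toy inverted index in Python. It is not Lucene code: the ToyIndex class, its methods, and the four sample documents are made up for illustration. It shows that a document with no start and no end appears in no postings list, so it can only be reached by subtracting from an explicit "all documents" set.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical toy inverted index: (field, value) -> set of doc ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.all_docs = set()  # the "select * from tablename" starting point

    def add(self, doc_id, fields):
        self.all_docs.add(doc_id)
        for field, value in fields.items():
            self.postings[(field, value)].add(doc_id)

    def range(self, field, low=None, high=None):
        """Doc ids whose `field` value lies in [low, high]; None means open."""
        hits = set()
        for (name, value), ids in self.postings.items():
            if name != field:
                continue
            if low is not None and value < low:
                continue
            if high is not None and value > high:
                continue
            hits |= ids
        return hits


index = ToyIndex()
index.add(1, {"start": 2012102000, "end": 2012103000})  # starts after today
index.add(2, {"start": 2012100100, "end": 2012100500})  # ended before today
index.add(3, {})                                        # no start, no end
index.add(4, {"start": 2012100100, "end": 2012103000})  # spans today

today = 2012101700
in_range = index.range("start", low=today) | index.range("end", high=today)
has_any = index.range("start") | index.range("end")

# Doc 3 is in no postings list, so no positive lookup can ever return it.
reachable = set().union(*index.postings.values())

# A negation only removes documents; it needs a set to remove them from.
missing = index.all_docs - has_any
result = in_range | missing
print(sorted(result))
```

In this model, `all_docs` is the piece that a bare "OR NOT" clause lacks.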

A couple of implementations could make something like this work. Either:

  • Store an actual value, a placeholder, for null start and end dates, and just search for that value instead. This is probably the best option.
  • AND the query for null start or end onto a term query you know will match all records you're interested in (similar to the query you arrived at in the other question you mentioned above), such as:

     (end: [ * TO 2012101700] OR start: [2012101700 TO *]) OR (term:GuaranteedHit AND -(end: [ * TO *] OR start: [* TO *])) 

You might need to add a field to accomplish this, which might make the first option more sensible. But, adding a field would allow you to more directly emulate a database-like structure, by allowing you to define a field for use like a table name.

Alternatively, if you use Solr's uniqueKey feature, you could use id:[* TO *] to find all documents, or, if you're willing to build your queries manually from objects, you could use a MatchAllDocsQuery.

Also, I would not expect great performance from this second option.
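As a rough sketch of the second option, the query string can be assembled by a small helper so the "match everything" clause is easy to swap. The function names below are hypothetical, and the clauses term:GuaranteedHit and id:[* TO *] are taken from the text above; whether either one matches every document depends on your schema.

```python
def range_clause(field, low="*", high="*"):
    """Build a Lucene range clause such as start:[2012101700 TO *]."""
    return f"{field}:[{low} TO {high}]"


def in_range_or_missing(day, all_docs_clause):
    """Combine the range query with a 'both fields missing' clause.

    `all_docs_clause` must be a clause that matches every document,
    so the negation has a set of documents to subtract from.
    """
    in_range = (
        f"({range_clause('end', high=day)} OR {range_clause('start', low=day)})"
    )
    has_value = f"({range_clause('end')} OR {range_clause('start')})"
    return f"{in_range} OR ({all_docs_clause} AND -{has_value})"


# Variant using a dedicated marker field (assumed to exist on every document).
query_marker = in_range_or_missing(2012101700, "term:GuaranteedHit")

# Variant using the unique key field, assuming it is named "id".
query_unique_key = in_range_or_missing(2012101700, "id:[* TO *]")

print(query_marker)
print(query_unique_key)
```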

The query:

-(start: [ * TO 2012101700] OR end: [2012101700 TO *])

is actually equivalent to:

(end: [ * TO 2012101700] OR start: [2012101700 TO *])
OR -(end: [ * TO *] 
OR start: [* TO *])

The [* TO *] terms are redundant, as the other terms already include those documents with fields outside the range, including those with no value at all!
