
Lucene document structure to correctly group a collection of keys

I have a Java model like this (some fields omitted):

@Searchable(root=true)
class Person {
  @SearchableProperty
  String sex;

  @SearchableProperty
  String name;
}

class Parent extends Person {
  @SearchableComponent
  List<Person> children;
}

This model creates a Lucene document with the following data for the person Anakin:

$/person/sex:male
$/person/name:anakin
$/person/children/sex:male
$/person/children/name:luke
$/person/children/sex:female
$/person/children/name:leia

Assuming this is only one of many documents, I can search like this:

  1. Find persons whose name starts with "an" and who have a male child:

     $/person/name:an* AND $/person/children/sex:male

  2. Find persons with a male child and a female child:

     $/person/children/sex:male AND $/person/children/sex:female

I run into trouble when trying to find a person who has a child with a specific name and sex, like this:

$/person/children/sex:male AND $/person/children/name:leia

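This returns a result, and I can see why: the document contains a male child (luke) and a child named leia, even though they are not the same child. I would like this query to return no results. My question is: how can I associate these nested properties so that a query only matches when both values belong to the same child?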
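The false match can be reproduced with a small plain-Java sketch. This is not Compass or Lucene code; the class and method names are made up for illustration, and the map simply models a flattened document in which each field holds every value indexed under it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class FlattenedDocDemo {
    // Build the flattened document for Anakin: field name -> all values under that field.
    static Map<String, Set<String>> anakin() {
        Map<String, Set<String>> doc = new HashMap<>();
        add(doc, "$/person/sex", "male");
        add(doc, "$/person/name", "anakin");
        add(doc, "$/person/children/sex", "male");
        add(doc, "$/person/children/name", "luke");
        add(doc, "$/person/children/sex", "female");
        add(doc, "$/person/children/name", "leia");
        return doc;
    }

    static void add(Map<String, Set<String>> doc, String field, String value) {
        doc.computeIfAbsent(field, k -> new HashSet<>()).add(value);
    }

    // A term matches if the value appears anywhere under the field;
    // which child it came from is no longer known.
    static boolean hasTerm(Map<String, Set<String>> doc, String field, String value) {
        return doc.getOrDefault(field, Collections.emptySet()).contains(value);
    }

    // Models: $/person/children/sex:male AND $/person/children/name:leia
    static boolean maleChildNamedLeia() {
        Map<String, Set<String>> doc = anakin();
        return hasTerm(doc, "$/person/children/sex", "male")
            && hasTerm(doc, "$/person/children/name", "leia");
    }

    public static void main(String[] args) {
        System.out.println(maleChildNamedLeia()); // prints true
    }
}
```

Both terms are satisfied, but by different children, which is exactly the information the flattened document has thrown away.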

I have considered:

  1. Storing the children as separate documents, though by doing this I lose the ability to search in the way I have written above.

  2. Using an id field somehow in the query to group these fields. I haven't been able to come up with a way which is 'right'. Variants I have considered:

     $/person/children/1/name:luke
     $/person/children/name:luke1
     $/person/children/name:1luke

I'm not familiar with Compass, but at the Lucene level you can use BlockJoinQuery: index each child as its own document in the same block as its parent document, then query the children and join the hits back to the parent.

Mike McCandless has an excellent blog post on using BlockJoinQuery in Lucene 3.4 [1]. That should give you the basic concepts. However, in Lucene 4 the API has changed and is now under the org.apache.lucene.search.join package. There is a code example in the Javadoc [2].

[1] http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html
[2] http://lucene.apache.org/core/4_10_1/join/org/apache/lucene/search/join/package-summary.html
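To show what a block join buys you, here is a plain-Java model of the semantics only. It does not use Lucene, and the "type" field, the class name and the method names are assumptions made for illustration. In real code the children and parent would be indexed together with IndexWriter.addDocuments and queried with ToParentBlockJoinQuery, as in the Javadoc example linked above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BlockJoinModel {
    // One small document: field -> value (single-valued for simplicity).
    static Map<String, String> doc(String... kv) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            d.put(kv[i], kv[i + 1]);
        }
        return d;
    }

    // A block: the child documents first, the parent document last.
    static List<Map<String, String>> anakinBlock() {
        return Arrays.asList(
            doc("type", "child", "sex", "male", "name", "luke"),
            doc("type", "child", "sex", "female", "name", "leia"),
            doc("type", "parent", "sex", "male", "name", "anakin"));
    }

    // Names of parents having at least one child that matches BOTH sex and name.
    static List<String> parentsWithChild(List<Map<String, String>> index,
                                         String sex, String name) {
        List<String> hits = new ArrayList<>();
        boolean childMatched = false;
        for (Map<String, String> d : index) {
            if ("parent".equals(d.get("type"))) {
                // The parent closes the block: report it if any child in the block matched.
                if (childMatched) {
                    hits.add(d.get("name"));
                }
                childMatched = false;
            } else if (sex.equals(d.get("sex")) && name.equals(d.get("name"))) {
                // Both conditions are checked against the same child document.
                childMatched = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = anakinBlock();
        System.out.println(parentsWithChild(index, "male", "leia"));   // prints []
        System.out.println(parentsWithChild(index, "female", "leia")); // prints [anakin]
    }
}
```

Because sex and name are tested against one child document at a time, "male AND leia" no longer matches Anakin, while the parent-level searches from the question can still be expressed by combining the joined result with conditions on the parent document.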
