Solr documents with child elements?
Is it somehow possible to create a Solr document that contains sub-elements?
For example, how would I represent something like this:
<person first="Bob" last="Smith">
  <children>
    <child first="Little" last="Smith" />
    <child first="Junior" last="Smith" />
  </children>
</person>
What is the usual way to solve this problem?
As of Solr 4.7 and 4.8, Solr supports nested documents:
{
  "id": "chapter1",
  "title": "Indexing Child Documents in JSON",
  "content_type": "chapter",
  "_childDocuments_": [
    {
      "id": "1-1",
      "content_type": "page",
      "text": "ho hum... this is page 1 of chapter 1"
    },
    {
      "id": "1-2",
      "content_type": "page",
      "text": "more text... this is page 2 of chapter 1"
    }
  ]
}
See the Solr release notes for more.
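To connect this back to the question, here is a minimal Python sketch that turns the person/children structure into the `_childDocuments_` shape shown above. The `id` values and the `content_type` values `person` and `child` are assumptions made for illustration; they are not part of the original question.

```python
import json


def person_to_nested_doc(person):
    """Build a parent document with its children under "_childDocuments_"."""
    parent_id = person["id"]
    return {
        "id": parent_id,
        "content_type": "person",  # assumed marker for parent documents
        "first": person["first"],
        "last": person["last"],
        "_childDocuments_": [
            {
                "id": f"{parent_id}-{n}",  # assumed id scheme: parent id + position
                "content_type": "child",   # assumed marker for child documents
                "first": child["first"],
                "last": child["last"],
            }
            for n, child in enumerate(person["children"], start=1)
        ],
    }


bob = {
    "id": "bob-smith",  # hypothetical id
    "first": "Bob",
    "last": "Smith",
    "children": [
        {"first": "Little", "last": "Smith"},
        {"first": "Junior", "last": "Smith"},
    ],
}

nested = person_to_nested_doc(bob)
# Wrap in a list: a JSON array of documents is one shape the update handler accepts.
payload = json.dumps([nested], indent=2)
print(payload)
```

The printed payload would then be posted to the core's update handler; the exact URL depends on your installation.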
You can model this in different ways, depending on your searching/faceting needs. Usually you'll use multivalued or dynamic fields. In the next examples I'll omit the field type, indexed and stored flags:
<field name="first"/>
<field name="last"/>
<field name="child_first" multiValued="true"/>
<field name="child_last" multiValued="true"/>
It's up to you to correlate the children's first names and last names. Or you could just put both in a single field:
<field name="first"/>
<field name="last"/>
<field name="child_first_and_last" multiValued="true"/>
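As a rough illustration of the two layouts above, the following Python sketch flattens a person into either parallel multivalued fields or a single combined field. Correlating by position assumes both lists come back in the order they were added, and the space separator in the combined field is an arbitrary choice.

```python
def to_parallel_fields(person):
    """One document with two parallel multivalued fields."""
    children = person["children"]
    return {
        "first": person["first"],
        "last": person["last"],
        "child_first": [c["first"] for c in children],
        "child_last": [c["last"] for c in children],
    }


def to_combined_field(person):
    """One document with a single multivalued field holding both names."""
    return {
        "first": person["first"],
        "last": person["last"],
        # Assumed separator: a single space between first and last name.
        "child_first_and_last": [
            f'{c["first"]} {c["last"]}' for c in person["children"]
        ],
    }


def correlate(doc):
    """Pair first and last names by position in the two lists."""
    return list(zip(doc["child_first"], doc["child_last"]))


bob = {
    "first": "Bob",
    "last": "Smith",
    "children": [
        {"first": "Little", "last": "Smith"},
        {"first": "Junior", "last": "Smith"},
    ],
}

parallel = to_parallel_fields(bob)
combined = to_combined_field(bob)
print(correlate(parallel))
print(combined["child_first_and_last"])
```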
Another option:
<field name="first"/>
<field name="last"/>
<dynamicField name="child_first_*"/>
<dynamicField name="child_last_*"/>
Here you would store fields 'child_first_1', 'child_last_1', 'child_first_2', 'child_last_2', etc. Again it's up to you to correlate values, but at least you have an index. With some code you could make this transparent.
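The "some code" could look roughly like the Python sketch below, which writes children into numbered dynamic fields and reads them back. The helper names `flatten_children` and `unflatten_children` are hypothetical, and the sketch assumes the numeric suffix starts at 1 as in the field names above.

```python
import re

# Matches the dynamic field names used above, e.g. "child_first_2".
CHILD_FIELD = re.compile(r"^child_(first|last)_(\d+)$")


def flatten_children(person):
    """Write each child into child_first_N / child_last_N fields."""
    doc = {"first": person["first"], "last": person["last"]}
    for n, child in enumerate(person["children"], start=1):
        doc[f"child_first_{n}"] = child["first"]
        doc[f"child_last_{n}"] = child["last"]
    return doc


def unflatten_children(doc):
    """Rebuild the list of children from the numbered fields."""
    children = {}
    for name, value in doc.items():
        match = CHILD_FIELD.match(name)
        if match:
            part, index = match.group(1), int(match.group(2))
            children.setdefault(index, {})[part] = value
    return [children[i] for i in sorted(children)]


bob = {
    "first": "Bob",
    "last": "Smith",
    "children": [
        {"first": "Little", "last": "Smith"},
        {"first": "Junior", "last": "Smith"},
    ],
}

doc = flatten_children(bob)
print(doc)
print(unflatten_children(doc))
```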
Bottom line: as the Solr wiki says: "Solr provides one table. Storing a set database tables in an index generally requires denormalizing some of the tables. Attempts to avoid denormalizing usually fail." It's up to you to denormalize your data according to your search needs.
UPDATE: Since version 4.5 or so Solr supports nested documents directly: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers
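For a feel of what a block join query looks like, here is a small Python sketch that builds a `{!parent which=...}` query string and URL-encodes it. The `content_type` values `person` and `child` and the field names are assumptions about how the nested documents were indexed; the `which` filter has to match all parent documents and nothing else.

```python
from urllib.parse import urlencode


def parent_query(parent_filter, child_query):
    """Return parents whose children match child_query (block join parent parser)."""
    return f'{{!parent which="{parent_filter}"}}{child_query}'


# Assumed schema: parents carry content_type:person, children content_type:child.
q = parent_query(
    "content_type:person",
    "+content_type:child +first:Little +last:Smith",
)
params = urlencode({"q": q, "wt": "json"})

print(q)
print(params)
```

The encoded `params` string would be appended to the select handler URL of your core. Because both clauses of the child query are evaluated against the same child document, a first name from one child cannot pair with a last name from another.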
Having separate fields for children leads to false-positive matches, because a query can match the first name of one child and the last name of another. Concatenated fields work to some extent, but it is a really limited approach. We have a lot of experience with similar tasks, blogged at http://blog.griddynamics.com/2011/06/solr-experience-search-parent-child.html
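The false-positive problem can be shown with a toy model in plain Python. This is not Solr code, only a simulation of how an AND over two multivalued fields behaves compared with matching per child, and the child named Jones is hypothetical data added to make the mismatch visible.

```python
def matches_flat(doc, first, last):
    """Toy model of `child_first:X AND child_last:Y` on multivalued fields."""
    return first in doc["child_first"] and last in doc["child_last"]


def matches_nested(children, first, last):
    """Toy model of matching both names within the same child."""
    return any(c["first"] == first and c["last"] == last for c in children)


# Hypothetical data: two children with different last names.
children = [
    {"first": "Little", "last": "Smith"},
    {"first": "Junior", "last": "Jones"},
]
flat = {
    "child_first": [c["first"] for c in children],
    "child_last": [c["last"] for c in children],
}

# There is no child called "Little Jones", yet the flat layout matches.
print(matches_flat(flat, "Little", "Jones"))        # True (false positive)
print(matches_nested(children, "Little", "Jones"))  # False
```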