Performance issue with multivalued field in Lucene
We're using Lucene 4.7 to build and query a rather large data set (110+ million documents).
One of the document fields, which we use for faceting, is defined as follows:
<field name="topic_paths"
type="string"
indexed="false"
stored="false"
docValues="true"
multiValued="true"
termVectors="false"
termPositions="false"
termOffsets="false"/>
Whenever we include this field in queries, they become extremely slow: about 7 seconds per topic_path value included in the search, so about 30 seconds for four topic_path values (typical in our case).
Queries that don't use this field are very fast (15 ms).
Is this the performance we should expect from Lucene with multi-valued fields used for faceting? Is there anything wrong or suboptimal with our field definition? Are there tricks we could use to speed up searches?
Details:
Read this article: http://wiki.apache.org/solr/SchemaXml#Fields
You need to index your field (indexed="true") to include it in search/faceting; otherwise Solr will silently skip the field without raising any exception.
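For illustration, here is a minimal sketch of the change this answer seems to be suggesting: the same field definition from the question with only the indexed attribute flipped to "true". Nothing else is altered, and whether this actually resolves the slowdown is an assumption that has not been verified against the asker's setup.

```xml
<!-- Sketch only: the question's field definition with indexed="true".
     All other attributes are copied unchanged from the question.
     Documents must be reindexed after changing this attribute. -->
<field name="topic_paths"
       type="string"
       indexed="true"
       stored="false"
       docValues="true"
       multiValued="true"
       termVectors="false"
       termPositions="false"
       termOffsets="false"/>
```

Existing documents were written without an inverted index for this field, so the change only takes effect for documents indexed after the schema is updated.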