
Lucene multivalued field - mixing TextField with StringField

I query my index using a single multivalued field. During indexing, that field is built from several field instances, some of which are TextField and some StringField.

The problem was that querying the index with a query of more than one word gave me: java.lang.IllegalStateException: field "someField" was indexed without position data; cannot run PhraseQuery (term=someTerm)

I changed the way the multivalued field is created so that only TextField is used, and the problem disappeared.

It seems that TextField and StringField should not be mixed in one multivalued field. Am I right? Could anybody explain why (or why not)?

StringField is explicitly set to index only docs (IndexOptions.DOCS_ONLY), which omits term frequencies and positions. Since it is effectively a keyword field, where a multi-word value is stored as a single token, running a phrase query against it doesn't really make sense.
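To make the role of positions concrete, here is a toy model in plain Java. It is not Lucene code and does not use Lucene's data structures: the class ToyField and its methods are hypothetical names invented for this illustration. It assumes, as I understand the Lucene 4.x behaviour, that once any value in a field is indexed without positions the field as a whole is treated as having none; other versions may handle the mix differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a hypothetical stand-in for one indexed field of one document.
class ToyField {
    // Term -> token positions; the list stays empty for terms added without positions.
    private final Map<String, List<Integer>> positions = new HashMap<>();
    // Assumption: one position-less value downgrades the whole field.
    private boolean hasPositions = true;
    private int nextPosition = 0;

    // Roughly what a TextField value does: split into word tokens, record positions.
    void addAnalyzed(String value) {
        for (String token : value.trim().toLowerCase().split("\\s+")) {
            positions.computeIfAbsent(token, t -> new ArrayList<>()).add(nextPosition++);
        }
    }

    // Roughly what a StringField value does: the whole value is one token, no positions.
    void addKeyword(String value) {
        positions.computeIfAbsent(value, t -> new ArrayList<>());
        hasPositions = false;
    }

    boolean containsTerm(String term) {
        return positions.containsKey(term);
    }

    // A two-word phrase matches only if the second word sits right after the first.
    boolean matchesPhrase(String first, String second) {
        if (!hasPositions) {
            throw new IllegalStateException(
                "field was indexed without position data; cannot run phrase query");
        }
        List<Integer> secondPositions =
            positions.getOrDefault(second, Collections.emptyList());
        for (int p : positions.getOrDefault(first, Collections.emptyList())) {
            if (secondPositions.contains(p + 1)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ToyField textOnly = new ToyField();
        textOnly.addAnalyzed("quick brown fox");
        System.out.println(textOnly.matchesPhrase("quick", "brown"));

        ToyField mixed = new ToyField();
        mixed.addAnalyzed("quick brown fox");
        mixed.addKeyword("quick brown");
        try {
            mixed.matchesPhrase("quick", "brown");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the keyword value is still findable as one exact term through containsTerm, but the phrase check has nothing to compare once positions are gone, which mirrors the exception in the question.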

While it is certainly possible to mix different field types in the same field, to me this seems to invite confusion and unpredictable results. I would recommend being consistent about the types added to a particular field. If you need values governed by significantly different logic, such as the difference between TextField and StringField, it is probably a much better idea to place them in different fields in the index.

If this is happening in some sort of catch-all convenience field (like the text field from this SOLR example), then using a TextField for everything is probably a reasonable idea.
