In Lucene, what is the difference between ANALYZED and ANALYZED_NO_NORMS?

Posted 2011-07-22 11:28:17. Tags: java, lucene, indexing

I do not understand the difference between two ways of indexing a field: ANALYZED and ANALYZED_NO_NORMS. I read the Lucene Javadoc, but it did not make the difference clear to me.

Can someone tell me more about norms? What benefits or limitations do they bring to indexing?

1 answer

ANALYZED

Index the tokens produced by running the field's value through an Analyzer. This is useful for ordinary text. An analyzer might be something like a Snowball stemmer analyzer:

http://e-mats.org/2009/05/modifying-a-lucene-snowball-stemmer/

ANALYZED_NO_NORMS

Also runs the field's value through an analyzer, but does not store norms for the field.

http://lucene.apache.org/java/2_4_0/scoring.html

Norms are computed at index time so that documents can be scored quickly at query time. They are usually all loaded into memory, so that when you run a query against the index, Lucene can score the search results quickly.
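
To make "norm" a bit more concrete, here is a rough sketch of what the value represents. It assumes the classic default similarity described in the scoring page linked above, where the length factor is 1/sqrt(number of terms in the field) and the norm is that factor multiplied by the document boost and the field boost. The class and method names are made up for illustration and are not Lucene API, and the sketch ignores the lossy compression of the result into a single byte.

```java
// Illustrative only: plain Java, no Lucene dependency.
// Assumes the classic formula norm = docBoost * fieldBoost * 1/sqrt(numTerms).
final class LengthNormSketch {

    /** Length factor: shorter fields get a larger value than longer ones. */
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** Combined index-time factor for one field of one document. */
    static float norm(float docBoost, float fieldBoost, int numTerms) {
        return docBoost * fieldBoost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // A match in a 4-term title counts for more than a match in a 100-term body.
        System.out.println("title (4 terms):  " + norm(1.0f, 1.0f, 4));
        System.out.println("body (100 terms): " + norm(1.0f, 1.0f, 100));
        // An index-time field boost of 2 doubles the stored factor.
        System.out.println("boosted title:    " + norm(1.0f, 2.0f, 4));
    }
}
```

Because this factor is precomputed per field and per document, the searcher only has to look it up while scoring instead of recomputing field lengths and boosts for every hit.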

No norms means that index-time field and document boosting and field length normalization are disabled. The benefit is lower memory usage: during searching, norms take up one byte of RAM per indexed field for every document in the index.
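
As a back-of-the-envelope illustration of that memory cost, the sketch below simply multiplies the one-byte-per-field-per-document figure out. The index size and field counts are hypothetical numbers chosen for the example, and the class is not part of Lucene.

```java
// Illustrative only: estimates norm memory from the rule of thumb above
// (one byte per document for each indexed field that keeps norms).
final class NormsMemoryEstimate {

    /** Bytes of RAM held by norms at search time. */
    static long normsBytes(long numDocs, int fieldsWithNorms) {
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        long docs = 10_000_000L;               // hypothetical index size
        long allNorms = normsBytes(docs, 5);   // five fields indexed as ANALYZED
        long someNorms = normsBytes(docs, 2);  // three of them switched to ANALYZED_NO_NORMS
        System.out.println("norms for 5 fields: " + allNorms + " bytes");
        System.out.println("norms for 2 fields: " + someNorms + " bytes");
        System.out.println("saved:              " + (allNorms - someNorms) + " bytes");
    }
}
```

With these made-up numbers the saving is 30,000,000 bytes, so dropping norms mostly pays off on large indexes with many fields, or on fields such as IDs and tags where length normalization and boosts do not matter for ranking.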