Creating and using LuceneAnalysisDefinitionProvider with Hibernate Search

When you search Stack Overflow or the Internet for LuceneAnalysisDefinitionProvider, you'll find hundreds of pages, each of them having the same code copied from another page without any decent explanation or further examples of usage.

So I tried to do it myself and failed. Here is my code:

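import org.apache.lucene.analysis.charfilter.MappingCharFilterFactory;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.core.StopFilterFactory;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionProvider;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionRegistryBuilder;

public class CustomLuceneAnalysisDefinitionProvider
        implements LuceneAnalysisDefinitionProvider {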

  @Override
  public void register(final LuceneAnalysisDefinitionRegistryBuilder builder) {
    builder
      .analyzer("customAnalyzer")
        .tokenizer(StandardTokenizerFactory.class)
        .charFilter(MappingCharFilterFactory.class)
          .param("mapping",
            "org/hibernate/search/test/analyzer/mapping-chars.properties")
        .tokenFilter(ASCIIFoldingFilterFactory.class)
        .tokenFilter(LowerCaseFilterFactory.class)
        .tokenFilter(StopFilterFactory.class)
          // WRONG! It's not "mapping"!
//        .param("mapping",
//          "org/hibernate/search/test/analyzer/stoplist.properties")
          .param("words",
            "classpath:/stoplist.properties")
          .param("ignoreCase", "true");
  }

}

Now we have CustomLuceneAnalysisDefinitionProvider. What's next?

  1. Where should mapping-chars.properties go, and how should it be addressed when passing it as a parameter to MappingCharFilterFactory?
  2. What are the contents of mapping-chars.properties, and how do I create my own or modify an existing one?
  3. Where should stoplist.properties go, and how should it be addressed when passing it as a parameter to StopFilterFactory?
  4. How do I apply the previously defined customAnalyzer to the single @Field shown below?
@Field(
    index = Index.YES,
    analyze = Analyze.YES,
    store = Store.NO,
    bridge = @FieldBridge(impl = LocalizedFieldBridge.class)
)
private LocalizedField description;

On some pages I found the option to put this definition into application.properties:

hibernate.search.lucene.analysis_definition_provider = com.thevegcat.app.search.CustomAnalysisDefinitionProvider

But I don't want to replace the original analyzer; I just want to use a custom analyzer for a few specific properties.


EDIT#1

Looking at org.apache.lucene.analysis.core.StopFilterFactory, line 86, one can see that it takes words as the key, not mapping.


EDIT#2

If you put your stop words file in src/main/resources, then you have to address it like this:

.param("words", "classpath:/stoplist.properties")

you'll find hundreds of pages, each of them having the same code copied from another page without any decent explanation or further examples of usage.

Hibernate Search 5 had its problems, one of which was a lack of documentation in some areas. Now that it's in maintenance mode, those problems are unlikely to get addressed.

There is some documentation for that feature in the Hibernate Search 5 documentation: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#section-programmatic-analyzer-definition

You'll get better documentation of that feature by migrating to Hibernate Search 6+.

That being said, most of your questions relate to Lucene features, so you probably won't find answers in Hibernate Search's documentation. You could find them in Lucene's documentation. How to find such documentation is explained in the Hibernate Search 6 documentation:

To know more about the behavior of these character filters, tokenizers and token filters, either browse the Lucene Javadoc or read the corresponding section on the Solr Wiki (you don't need Solr to use these analyzers, it's just that there is no documentation page for Lucene proper).


Where should mapping-chars.properties go, and how should it be addressed when passing it as a parameter to MappingCharFilterFactory?

In your classpath.

What are the contents of mapping-chars.properties, and how do I create my own or modify an existing one?

That's the kind of thing that Lucene doesn't document, at least not clearly. Solr's documentation is better: https://solr.apache.org/guide/6_6/charfilterfactories.html#CharFilterFactories-solr.MappingCharFilterFactory
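To give a feel for what such a file contains, here is a minimal sketch using only the JDK. It assumes the one-rule-per-line syntax described in the Solr page linked above ("source" => "target", with lines starting with # treated as comments). The class MappingRulesSketch, its methods and the sample rules are hypothetical and only illustrate the format. This is not Lucene's implementation: it ignores escape sequences and applies rules by plain sequential replacement.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MappingRulesSketch {

    // Assumed rule syntax, one rule per line: "source" => "target"
    private static final Pattern RULE =
            Pattern.compile("\"(.*)\"\\s*=>\\s*\"(.*)\"");

    // Parses rule lines into an ordered source -> target map,
    // skipping blank lines and lines that start with '#'.
    static Map<String, String> parse(String[] lines) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            Matcher m = RULE.matcher(trimmed);
            if (!m.matches()) {
                throw new IllegalArgumentException("Invalid mapping rule: " + line);
            }
            rules.put(m.group(1), m.group(2));
        }
        return rules;
    }

    // Simplified illustration: applies each rule in file order as a literal replacement.
    static String apply(Map<String, String> rules, String input) {
        String out = input;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            out = out.replace(rule.getKey(), rule.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical contents of a mapping file, one array element per line.
        String[] sampleFile = {
            "# hypothetical sample rules",
            "\"&\" => \" and \"",
            "\"C++\" => \"cpp\""
        };
        // prints "R and D in cpp"
        System.out.println(apply(parse(sampleFile), "R&D in C++"));
    }
}
```

A real mapping file would hold just the rule lines themselves (without the Java string quoting), saved as a text file on the classpath.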

Where should stoplist.properties go, and how should it be addressed when passing it as a parameter to StopFilterFactory?

Put it in the classpath, and pass the path to that file from the root of your classpath.
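If you are unsure whether a file really sits where you think it does, you can look it up from the classpath root yourself. The sketch below uses only the JDK. ClasspathCheck, normalize and isOnClasspath are hypothetical names, and stripping a classpath: prefix and leading slashes is an assumption of this helper, not a description of how Hibernate Search or Lucene resolves the parameter.

```java
public class ClasspathCheck {

    // Reduces "classpath:/stoplist.properties" or "/stoplist.properties"
    // to the root-relative form "stoplist.properties" that ClassLoader expects.
    static String normalize(String path) {
        String p = path.startsWith("classpath:")
                ? path.substring("classpath:".length())
                : path;
        while (p.startsWith("/")) {
            p = p.substring(1);
        }
        return p;
    }

    // Returns true if the resource can be found from the root of the classpath.
    static boolean isOnClasspath(String path) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(normalize(path)) != null;
    }

    public static void main(String[] args) {
        // File names taken from the question; adjust to your own layout.
        String[] candidates = {
            "classpath:/stoplist.properties",
            "org/hibernate/search/test/analyzer/mapping-chars.properties"
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> "
                    + (isOnClasspath(candidate) ? "found" : "NOT found"));
        }
    }
}
```

With a typical Maven or Gradle layout, a file saved as src/main/resources/stoplist.properties ends up at the classpath root, so its root-relative path is simply stoplist.properties.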

How do I apply the previously defined customAnalyzer to the single @Field shown below?

Well, that is documented, at least: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#_referencing_named_analyzers

@Field(analyzer = @Analyzer(definition = "customAnalyzer"))

On some pages I found the option to put this definition into application.properties:

 hibernate.search.lucene.analysis_definition_provider = com.thevegcat.app.search.CustomAnalysisDefinitionProvider

But I don't want to replace the original analyzer; I just want to use a custom analyzer for a few specific properties.

You won't replace an "analyzer"; you will register an analysis definition provider. That provider adds analyzer definitions to Hibernate Search, which can then be referenced from @Field. Setting an analysis definition provider does not, in itself, change your mapping in any way.
