
How does Sunspot modify Solr's schema.xml? Does it modify it at all?

Let me know if I am wrong, but I think Solr only accepts fields that are already declared in schema.xml. So if I have a field called 'title', I need to declare it in the schema.

There is no mention of modifying schema.xml in Sunspot's documentation. I am just wondering how Sunspot modifies schema.xml to allow custom fields to be added to the index.

I also know that Sunspot uses RSolr under the hood. So if there is a way to modify the schema and reload data from the DB into Solr using RSolr, please let me know.

As karmajunkie alludes to, Sunspot uses its own standard schema. I'll go into how that works in a bit more detail here.

Solr Schema 101

For the purposes of this discussion, Solr schemas are mostly composed of two things: type definitions and field definitions.

A type definition sets up a type by specifying its name, the Java class for the type, and in the case of some types (notably text), a subordinate block of XML configuring how that type is handled.

A field definition lets you specify the name of a field and the name of the type of value it contains. This lets Solr correlate the name of a field in a document with its type and a handful of other options, and thus determine how that field's value should be processed in your index.

Solr also supports a dynamicField definition, which lets you specify a pattern with a glob in it instead of a static field name. Incoming field names are matched against these patterns to determine their types.
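As a rough illustration of that matching step, here's a sketch in plain Ruby that resolves an incoming field name against a list of glob patterns with File.fnmatch. This is not Solr's implementation: the pattern table, the type names, and the resolve_type helper are all hypothetical, and Solr has its own precedence rules when a name could match more than one definition.

```ruby
# Hypothetical pattern table (glob => type name), loosely modelled on
# dynamicField entries. The names are illustrative, not copied from a schema.
DYNAMIC_FIELDS = {
  "*_text" => "text",
  "*_s"    => "string",
  "*_i"    => "integer"
}.freeze

# Return the type name of the first pattern that matches field_name,
# or nil when no pattern matches.
def resolve_type(field_name, patterns = DYNAMIC_FIELDS)
  match = patterns.find { |glob, _type| File.fnmatch(glob, field_name) }
  match && match.last
end

puts resolve_type("body_text").inspect  # prints "text"
puts resolve_type("title_s").inspect    # prints "string"
puts resolve_type("unknown").inspect    # prints nil
```

A name that matches no pattern returns nil here; that's the situation the question describes, where Solr has no definition for an incoming field.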

Sunspot's conventional schema

Sunspot's schema has a handful of field definitions for internally used fields, such as the ID and model name. Additionally, Sunspot makes liberal use of dynamicField definitions to establish naming conventions based on types.

This use of field naming conventions allows Sunspot to define a configuration DSL that creates a mapping from your model into an XML document ready to be indexed by Solr.

For example, this simple configuration block in your model…

searchable do
  text :body
end

…will be used by Sunspot to create a field named body_text. This field name is matched against the *_text pattern of the following dynamicField definition in the schema:

<dynamicField name="*_text" type="text" indexed="true" stored="false" multiValued="true"/>

This maps any field with the suffix _text to Sunspot's definition of the text type. If you take a look at Sunspot's schema.xml, you'll see many similar conventions for other types and options. The :stored => true option, for example, will typically add an s to that type's suffix (e.g., _texts).
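To make the convention concrete, here's a simplified sketch of how a DSL might derive the indexed field name from an attribute name, a type, and the stored option. It isn't Sunspot's actual code: indexed_name and TYPE_SUFFIXES are hypothetical, only the _text and _texts suffixes come from the description above, and the string entry is an assumption added for illustration.

```ruby
# Hypothetical, simplified suffix table. Sunspot's real schema covers more
# types and options (multi-valued fields, for instance) than shown here.
TYPE_SUFFIXES = {
  :text   => "text",
  :string => "s"      # assumed for illustration
}.freeze

# Build "<name>_<type suffix>", appending "s" when the value is stored.
def indexed_name(name, type, stored = false)
  suffix = TYPE_SUFFIXES.fetch(type)
  "#{name}_#{suffix}#{'s' if stored}"
end

puts indexed_name(:body, :text)        # prints body_text
puts indexed_name(:body, :text, true)  # prints body_texts
```

Because the name carries the type, the schema never needs a per-attribute entry; a handful of suffix patterns covers every model.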

Modifying Sunspot's schema in practice

In my experience with my clients' projects and my own, there are two good reasons to modify Sunspot's schema. The first is to change the text field's analyzers based on the features your application needs. The second is to create brand-new types (usually based on the text type) for a more fine-grained application of Solr analyzers.

For example, "fuzzy" searches that widen matches can be run against a special text-based field that also uses linguistic stemming or NGrams. The tokens in the original text field can be used to populate spellcheck or to boost exact matches, while the tokens in a custom text_ngram or text_en field can broaden search results when the stricter matching fails.
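To see why NGrams broaden matches, here's a small sketch of the kind of token expansion an NGram filter performs. It's plain Ruby for illustration only, not Solr's implementation; the ngrams helper and its default sizes are hypothetical.

```ruby
# Return every substring of token whose length is between min and max.
def ngrams(token, min = 2, max = 3)
  (min..max).flat_map do |size|
    token.each_char.each_cons(size).map(&:join)
  end
end

grams = ngrams("solr")
puts grams.inspect          # prints ["so", "ol", "lr", "sol", "olr"]
puts grams.include?("ol")   # prints true
```

With these fragments in the index, a partial query such as "ol" can match a document that only contains "solr", which an exact-token text field would miss.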

Sunspot's DSL provides one final feature for mapping your fields to these custom types. Once you have set up the type and its corresponding dynamicField definition(s), you can use Sunspot's :as option to override the convention-based name generation.

For example, having added a custom ngram type for the case above, we might process the body a second time with NGrams using the following Ruby code:

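searchable do
  text :body
  # Index body a second time; :as overrides the generated field name, so the
  # schema needs a matching definition (for example a *_ngram dynamicField).
  text(:body_ngram, :as => 'body_ngram') { body }
end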

Sunspot comes with a stock schema that is tuned a little for a Sunspot integration and adheres to the principle of least surprise for the developer. For example, the stock solrconfig.xml turns autocommit off, even though in production you'll want to turn it on. The schema really has more to do with types than fields; see the link below for an example of how to create a new field type. Indexing a field is trivial if it fits into one of the existing types. For example:

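class Blog < ActiveRecord::Base  # assumes a Rails model with sunspot_rails loaded
  searchable do
    text :title  # indexed as title_text under the stock schema
  end
end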

And in the search process, you'd do something like this:

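class BlogSearch
  def self.search(options = {})
    Sunspot.search(Blog) do
      # :title is a text field, so query it with fulltext; with() only
      # applies to attribute fields such as string or integer.
      if options[:title].present?
        fulltext(options[:title]) { fields(:title) }
      end
    end
  end
end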

Sunspot's wiki has a lot of additional documentation. Here's an example of adding a custom type to allow ngram searching:

https://github.com/outoftime/sunspot/wiki/Wildcard-searching-with-ngrams
