
SOLR managed-schema, how to use it?

I got my Solr instance to work, and it works decently, but I have no clue what exactly managed-schema is. I used the default version and added a few lines that I needed for my case:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="name" type="text_general" indexed="true" stored="true" default="" />
<field name="brand_id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="brand_name" type="text_general" indexed="true" stored="true" default="" />
<field name="type" type="string" indexed="true" stored="true" required="true" default="0"  />

I cannot include the full file because it is about 700 lines, but the full XML is here: http://pastebin.com/Z9nc36QD

Do I have to keep everything as in the default example? I have no clue... Do you have an example of a typical schema file?

The Managed Schema is supposed to be manipulated through the Schema API and not by editing the files present (which include a warning about doing so). The schema.xml file is only read once, at the first startup, to create the initial schema; any changes after that have to be made through the Schema API.
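As a rough sketch of what "through the Schema API" means for the fields in the question, the Python below builds a single `add-field` command and, optionally, posts it. The URL, port, and the core name `gettingstarted` are assumptions (the core name is borrowed from the curl example further down this page), and the helper names `build_add_field_payload` and `post_schema_command` are made up for illustration. The `id` field is left out because the default schema normally defines it already.

```python
import json
import urllib.request

# Assumption: a local Solr instance with a core/collection called "gettingstarted".
SCHEMA_URL = "http://localhost:8983/solr/gettingstarted/schema"

# Field definitions taken from the question ("id" omitted, see the note above).
FIELDS = [
    {"name": "name", "type": "text_general", "indexed": True, "stored": True,
     "default": ""},
    {"name": "brand_id", "type": "string", "indexed": True, "stored": True,
     "required": True, "multiValued": False},
    {"name": "brand_name", "type": "text_general", "indexed": True, "stored": True,
     "default": ""},
    {"name": "type", "type": "string", "indexed": True, "stored": True,
     "required": True, "default": "0"},
]


def build_add_field_payload(fields):
    """Return the JSON body for one "add-field" command covering all fields."""
    return json.dumps({"add-field": fields}, indent=2)


def post_schema_command(payload, url=SCHEMA_URL):
    """POST a Schema API command and return Solr's decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the payload; nothing is sent unless post_schema_command is called.
    print(build_add_field_payload(FIELDS))
```

To actually apply the change against a running Solr, call `post_schema_command(build_add_field_payload(FIELDS))`. Solr then rewrites the managed-schema file itself, so there is no need to edit it by hand.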

If you want to use a schema.xml file like older Solr versions do, without any Schema API support, you can use the ClassicIndexSchemaFactory in your solrconfig.xml file. See the Schema Factory Definition:

<schemaFactory class="ClassicIndexSchemaFactory"/>

An alternative to using a managed schema is to explicitly configure a ClassicIndexSchemaFactory. ClassicIndexSchemaFactory requires the use of a schema.xml configuration file, and disallows any programmatic changes to the Schema at run time. The schema.xml file must be edited manually and is loaded only when the collection is loaded.

You only need to keep the parts of the schema that you actually use. The example schema (depending on which schema a user starts out with) will usually have many, many fields and field types that you don't need. These can be removed until they are needed, and the field types can be tweaked to enable the features that you want.

Do, however, remember that a change to the schema will require the content to be reindexed before the change becomes visible when searching.

Exact schema design is something you'll have to work on and experiment with until you get the query profile and matching features that you need.

You are supposed to use Solr's Schema API. More information can be found here: https://lucene.apache.org/solr/guide/7_2/schema-api.html

It basically means you issue curl -X POST requests (to localhost) from a shell, and Solr updates the managed-schema file for you.

Example:

curl -X POST -H 'Content-type:application/json' --data-binary '{
 "add-field-type" : {
 "name":"myNewTxtField",
 "class":"solr.TextField",
 "positionIncrementGap":"100",
 "analyzer" : {
    "charFilters":[{
       "class":"solr.PatternReplaceCharFilterFactory",
       "replacement":"$1$1",
       "pattern":"([a-zA-Z])\\\\1+" }],
    "tokenizer":{
       "class":"solr.WhitespaceTokenizerFactory" },
    "filters":[{
       "class":"solr.WordDelimiterFilterFactory",
       "preserveOriginal":"0" }]}}
}' http://localhost:8983/solr/gettingstarted/schema

Personal commentary

It's 2018; there really should just be a web interface in the existing admin console to build and issue these localhost commands. I get that things can become tricky if there's a ZooKeeper, but basic exploration on a single server should be trivial, and currently it is not. Such an interface could show the formatted curl command, so it would train new developers on proper usage.

Developers have to translate the XML from documentation like this into correct JSON for the POST:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"> 
  <analyzer type="index"> 
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
      synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" 
      ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" 
      synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
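The translation can be partly mechanized. The sketch below is not an official tool, only an illustration: it parses a `<fieldType>` element with Python's standard `xml.etree.ElementTree` and emits an `add-field-type` command. It assumes the Schema API's JSON key names `analyzer`, `indexAnalyzer`, `queryAnalyzer`, `charFilters`, `tokenizer` and `filters`, handles only untyped, `index` and `query` analyzers, and keeps every attribute value as a string, as the curl example above does. The function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Maps the XML analyzer "type" attribute to the JSON key used by the Schema API.
# Any other analyzer type raises KeyError in this sketch.
ANALYZER_KEYS = {None: "analyzer", "index": "indexAnalyzer", "query": "queryAnalyzer"}

TEXT_GENERAL_XML = """
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- commented-out elements are dropped by the parser
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
"""


def analyzer_to_dict(analyzer):
    """Convert one <analyzer> element into its JSON-style dictionary."""
    result = {}
    char_filters = [dict(e.attrib) for e in analyzer.findall("charFilter")]
    if char_filters:
        result["charFilters"] = char_filters
    tokenizer = analyzer.find("tokenizer")
    if tokenizer is not None:
        result["tokenizer"] = dict(tokenizer.attrib)
    filters = [dict(e.attrib) for e in analyzer.findall("filter")]
    if filters:
        result["filters"] = filters
    return result


def field_type_to_command(xml_text):
    """Build an "add-field-type" command from a <fieldType> XML snippet."""
    root = ET.fromstring(xml_text)
    if root.tag != "fieldType":
        raise ValueError("expected a <fieldType> element, got <%s>" % root.tag)
    body = dict(root.attrib)  # name, class, positionIncrementGap, ...
    for analyzer in root.findall("analyzer"):
        body[ANALYZER_KEYS[analyzer.get("type")]] = analyzer_to_dict(analyzer)
    return {"add-field-type": body}


if __name__ == "__main__":
    print(json.dumps(field_type_to_command(TEXT_GENERAL_XML), indent=2))
```

The printed JSON can be used as the `--data-binary` body of a curl call like the one shown earlier. If the field type already exists in the schema, `replace-field-type` would be the command to use instead of `add-field-type`.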
