
Multi-field text and keyword fields in Elasticsearch

I'm looking into switching from Solr to Elasticsearch and have indexed a bunch of documents without providing a schema/mapping. A lot of the fields that I would previously have set as indexed strings in Solr have been set as both text and keyword fields using multi-fields.

Is there any benefit to having a keyword field also indexed as a text field using multi-fields? In my case most values in these fields are single words, so I'd imagine it wouldn't matter if they are sent to the analyzer, but the ES docs seem to imply that keyword fields are not considered when searching, or are at least treated differently.

To expand on that a little: if I search for the term "ipad", would a document score higher if it had "ipad" in a keyword field as well as in some other text field, compared with the same document without the keyword field? And if "ipad" was only in a keyword field, would the document still match?

To answer my own question, I created a quick test. In this test, keyword and text fields behaved pretty much the same when searching, and the multi-field got the same score as a plain field of its primary type, so I guess the second field has no effect on search scoring.

Weirdly, a multi-word value scored the same in a keyword field as in a text field. I would have expected the keyword field to score lower or not match at all, but for my purposes that is fine, so I'm not going to investigate it further.
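
One possible explanation for the multi-word result, offered as an assumption and not something this test confirms: the search uses q=ipad with no field name, so on versions where such a query goes to an analyzed catch-all field (for example `_all`), the keyword value may be matched through that catch-all field and not through keywordfield itself. The Python sketch below is a toy model, not Elasticsearch code. `analyze_text`, `analyze_keyword` and `matches` are hypothetical helpers, and the whitespace split only approximates a real analyzer.

```python
def analyze_text(value):
    # Toy stand-in for a text analyzer: lowercase, then split on whitespace.
    return value.lower().split()


def analyze_keyword(value):
    # A keyword field indexes the whole value as one unmodified term.
    return [value]


def matches(term, indexed_terms):
    # A term matches only if it equals one of the indexed terms exactly.
    return term in indexed_terms


# Values taken from documents 2, 3 and 4 in the test data.
keyword_single = matches("ipad", analyze_keyword("ipad"))
keyword_multi = matches("ipad", analyze_keyword("a green ipad"))
text_multi = matches("ipad", analyze_text("a yellow ipad"))

# Assumption: a catch-all field analyzes a copy of every value like text.
catch_all_multi = matches("ipad", analyze_text("a green ipad"))

print(keyword_single, keyword_multi, text_multi, catch_all_multi)
# prints: True False True True
```

In this model the keyword field alone does not match "ipad" inside "a green ipad", while an analyzed copy of the same value does. Searching a single field, for example q=keywordfield:ipad, should show whether the real index behaves the same way.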

Index Creation

PUT test_index
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "test_type" : {
            "properties" : {
                "multifield": {
                  "type": "text",
                  "fields": {
                     "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                     }
                  }
                },

                "keywordfield": {
                  "type": "keyword"
                },

                "textfield": {
                  "type": "text"
                }

            }
        }
    }
}

Data Insert

POST /_bulk
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 1 } }
{ "doc" : { "multifield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 2 } }
{ "doc" : { "keywordfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 3 } }
{ "doc" : { "keywordfield" : "a green ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 4 } }
{ "doc" : { "textfield" : "a yellow ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 5 } }
{ "doc" : { "keywordfield" : "ipad", "textfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 6 } }
{ "doc" : { "keywordfield" : "unrelated", "textfield" : "hopefully this wont show up"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 7 } }
{ "doc" : { "textfield" : "ipad"  }, "doc_as_upsert" : true }

Results

GET /test_index/_search?q=ipad
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0.28122374,
      "hits": [
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "5",
            "_score": 0.28122374,
            "_source": {
               "keywordfield": "ipad",
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "1",
            "_score": 0.2734406,
            "_source": {
               "multifield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "2",
            "_score": 0.2734406,
            "_source": {
               "keywordfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "7",
            "_score": 0.2734406,
            "_source": {
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "3",
            "_score": 0.16417998,
            "_source": {
               "keywordfield": "a green ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "4",
            "_score": 0.16417998,
            "_source": {
               "textfield": "a yellow ipad"
            }
         }
      ]
   }
}
