
Elasticsearch count terms ignore case

Following is my aggregation.

{
    "size": 0,
    "aggs": {
        "cities": {
            "terms": {
                "field": "city.raw"
            }
        }
    }
}

Mapping

"properties": {
    "state" : {
      "type": "string",
      "fields": {
        "raw" : {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }

Works great. But it groups the fields case-sensitively.

e.g.

{
    "key": "New York",
    "doc_count": 45
},
{
    "key": "new york",
    "doc_count": 11
},
{
    "key": "NEW YORK",
    "doc_count": 44
}

I want the result to come out like this:

{
    "key": "new york",
    "doc_count": 100
}
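If reindexing with a lowercasing analyzer is not possible right away, one workaround (a sketch, not part of the original answer) is to merge the buckets client-side after the terms aggregation returns, lowercasing each key. Using the bucket counts from the question:

```python
from collections import Counter

def merge_buckets_case_insensitive(buckets):
    """Merge terms-aggregation buckets whose keys differ only by case."""
    merged = Counter()
    for bucket in buckets:
        merged[bucket["key"].lower()] += bucket["doc_count"]
    # Re-emit in the usual terms-aggregation shape, largest count first
    return [{"key": k, "doc_count": n} for k, n in merged.most_common()]

# The buckets from the question:
buckets = [
    {"key": "New York", "doc_count": 45},
    {"key": "new york", "doc_count": 11},
    {"key": "NEW YORK", "doc_count": 44},
]
print(merge_buckets_case_insensitive(buckets))
# [{'key': 'new york', 'doc_count': 100}]
```

Note this only fixes the display; because Elasticsearch still builds separate buckets per casing, a `size` limit on the aggregation can drop variants before they reach the client, so the analyzer fix below is the robust solution.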

I think the problem is that you use the raw version of the indexed string:

city.raw

Don't you have any analyzed version of your field? It would also be great if you put the mapping of the field in the example.

Update: You should use a custom analyzer for what you need. The tokenizer should be keyword and the filter lowercase. Then index your data with this analyzer, and it should work.

            "analyzer": {
                "my_analyzer": {
                    "type":         "custom",                       
                    "tokenizer":    "keyword",
                    "filter":       "lowercase"
                }   
            }
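For context, this analyzer would sit under the index's `settings.analysis` block and be referenced from the field mapping. A minimal sketch (the `city` field name is taken from the question's aggregation; the exact mapping shape varies by Elasticsearch version, and older versions nest `properties` under a type name):

```json
{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "city": {
                "type": "string",
                "fields": {
                    "raw": {
                        "type": "string",
                        "analyzer": "my_analyzer"
                    }
                }
            }
        }
    }
}
```

With the keyword tokenizer the whole value stays one token, and the lowercase filter normalizes it, so after reindexing, "New York", "new york", and "NEW YORK" all land in the single `new york` bucket when you aggregate on `city.raw`.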

And some info: Keyword Analyzer and Custom Analyzers.

