
Elasticsearch terms aggregation: ignore case

The following is my aggregation:

{
    "size": 0,
    "aggs": {
        "cities": {
            "terms": {
                "field": "city.raw"
            }
        }
    }
}

Mapping:

"properties": {
    "state": {
      "type": "string",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
}

This works, but it groups the values case-sensitively.

e.g.

{
    "key": "New York",
    "doc_count": 45
},
{
    "key": "new york",
    "doc_count": 11
},
{
    "key": "NEW YORK",
    "doc_count": 44
}

I want the result to look like this:

{
    "key": "new york",
    "doc_count": 100
}

I think the problem is that you are aggregating on the raw (not_analyzed) version of the indexed string:

city.raw

Don't you have an analyzed version of your field? It would also help if you included the field's mapping in the example.

Update: You should use a custom analyzer for this. The tokenizer should be keyword and the filter lowercase. Index your data with this analyzer, and the aggregation should then group values case-insensitively.

            "analyzer": {
                "my_analyzer": {
                    "type":      "custom",
                    "tokenizer": "keyword",
                    "filter":    ["lowercase"]
                }
            }
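Putting the pieces together, index creation might look like the sketch below. It assumes the pre-5.x `string`/`not_analyzed` mapping style used above; the index name `my_index`, the type name `doc`, and the `lower` subfield name are illustrative. The aggregation then targets `city.lower` instead of `city.raw`.

```
PUT /my_index
{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type":      "custom",
                    "tokenizer": "keyword",
                    "filter":    ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "city": {
                    "type": "string",
                    "fields": {
                        "raw":   { "type": "string", "index": "not_analyzed" },
                        "lower": { "type": "string", "analyzer": "my_analyzer" }
                    }
                }
            }
        }
    }
}
```

Because `my_analyzer` keeps the whole value as a single token and only lowercases it, running the terms aggregation with `"field": "city.lower"` should return one bucket such as `"new york"` with the combined doc count. Note that you must reindex existing documents for the new subfield to be populated.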

For more information, see the Keyword Analyzer and Custom Analyzers documentation.
