Elasticsearch pattern_replace char filter doesn't work

I have been trying for hours to find out why this simple example doesn't work. I reduced the regex to a trivial one, because my real patterns don't work at all.

{
    "settings" : {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index": {
        "analysis": {
            "char_filter" : {
                "my_pattern" :{
                    "type": "pattern_replace",
                    "pattern": "a",
                    "replacement": "u"
                }
            },
            "analyser": {
                "my_analyser": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "char_filter": ["my_pattern"]
                    }
                }
            }
        }
    },
    "mappings" : {
        "my_type" : {
            "_source": {
                "enabled": true
            }
        }
    },
    "properties": {
        "test": {
            "type": "string",
            "store": true,
            "index": "analysed",
            "analyser": "my_analyser",
            "index_options": "positions"
        }
    }
}

Thank you for your help

I indexed one word: "hang"

$ curl -XGET 'http://localhost:9200/tm_de_fr/my_type/_search?q=hang&pretty=true'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.30685282,
    "hits" : [ {
      "_index" : "tm_de_fr",
      "_type" : "my_type",
      "_id" : "-DWWF4kPR7S2YwZeyIsdVQ",
      "_score" : 0.30685282,
      "_source":{ "test": "hang" }
    } ]
  }
}

and

$ curl -XGET 'http://localhost:9200/tm_de_fr/my_type/_search?q=hung&pretty=true'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

I'm not sure whether the _source is supposed to change as well, but neither the indexed data nor the _source has changed. I expected "hang" to be indexed as "hung".

$ curl -XGET 'http://localhost:9200/tm_de_fr/my_type/-DWWF4kPR7S2YwZeyIsdVQ?pretty=true'
{
  "_index" : "tm_de_fr",
  "_type" : "my_type",
  "_id" : "-DWWF4kPR7S2YwZeyIsdVQ",
  "_version" : 1,
  "found" : true,
  "_source":{ "test": "hang" }
}

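Your mapping is incorrect: "properties" has to be nested inside the type ("my_type") under "mappings", and you need to use the American spelling, "analyzer" rather than "analyser":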

{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "index": {
            "analysis": {
                "char_filter": {
                    "my_pattern": {
                        "type": "pattern_replace",
                        "pattern": "a",
                        "replacement": "u"
                    }
                },
                "analyzer": {
                    "my_analyzer": {
                        "tokenizer": "standard",
                        "char_filter": [
                            "my_pattern"
                        ]
                    }
                }
            }
        }
    },
    "mappings": {
        "my_type": {
            "properties": {
                "test": {
                    "type": "string",
                    "analyzer": "my_analyzer",
                    "index_options": "positions"
                }
            }
        }
    }
}
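
If you want to catch this kind of mistake before creating the index, a minimal Python sketch could look like the following. The helper names find_spelling_problems and check_index_body are hypothetical and not part of any Elasticsearch client; the sketch only looks for the British spellings and for a "properties" block placed outside "mappings", nothing else.

```python
def find_spelling_problems(body, path=""):
    """Walk a nested dict and report British spellings (hypothetical helper)."""
    problems = []
    if isinstance(body, dict):
        for key, value in body.items():
            here = f"{path}.{key}" if path else key
            if key == "analyser":
                problems.append(f"{here}: key should be 'analyzer'")
            if value == "analysed":
                problems.append(f"{here}: value should be 'analyzed'")
            # Recurse into nested objects; non-dict values are ignored.
            problems.extend(find_spelling_problems(value, here))
    return problems


def check_index_body(body):
    """Spelling checks plus a check for a misplaced top-level 'properties'."""
    problems = find_spelling_problems(body)
    if "properties" in body:
        problems.append("properties: must be nested under mappings.<type>")
    return problems


# Trimmed-down version of the body from the question.
broken = {
    "settings": {"index": {"analysis": {"analyser": {"my_analyser": {"type": "custom"}}}}},
    "mappings": {"my_type": {}},
    "properties": {"test": {"type": "string", "index": "analysed", "analyser": "my_analyser"}},
}

for problem in check_index_body(broken):
    print(problem)
```

Run against the body from the question, this reports the misspelled "analyser" keys, the "analysed" value and the top-level "properties" block.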

Testing the analyzer with the analyze API:

 curl -XGET 'localhost:9200/test/_analyze?analyzer=my_analyzer&pretty=true' -d 'aaaa'

returns:

{
    "tokens" : [ {
        "token" : "uuuu",
        "start_offset" : 0,
        "end_offset" : 4,
        "type" : "<ALPHANUM>",
        "position" : 1
    } ]
}
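
As for the _source: it is stored exactly as you sent it, so it will still show "hang". The char filter only changes the terms that go into the index. Here is a rough Python simulation of that split. It is not Elasticsearch code: re.sub stands in for the pattern_replace char filter, str.split stands in for the tokenizer, and the function names are made up for illustration.

```python
import re


def pattern_replace(text, pattern, replacement):
    """Stand-in for a pattern_replace char filter: rewrite the raw text."""
    return re.sub(pattern, replacement, text)


def analyze(text):
    """Apply the char filter first, then split on whitespace as the tokenizer."""
    filtered = pattern_replace(text, "a", "u")
    return filtered.split()


document = {"test": "hang"}

# Terms that would be written to the index for the "test" field.
indexed_terms = analyze(document["test"])

print(indexed_terms)  # ['hung']
print(document)       # {'test': 'hang'}  (the stored source is untouched)
```

So with the corrected mapping you should expect the term "hung" in the index for the test field, while a GET on the document keeps returning "hang" in _source.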
