
ElasticSearch and edgeNGram for autocomplete/typeahead: is my search_analyzer being ignored?

I've got three documents with a "userName" field:

  • 'briandilley'
  • 'briangumble'
  • 'briangriffen'

When I search for 'brian' I get all three back, as expected, but when I search for 'briandilley' I still get all three back. The analyze API tells me that the ngram filter is being applied to my search string, and I'm not sure why. Here's my setup:

index settings:

{
    "analysis": {
        "analyzer": {
            "username_index": {
                "tokenizer": "keyword",
                "filter": ["lowercase", "username_ngram"]
            },
            "username_search": {
                "tokenizer": "keyword",
                "filter": ["lowercase"]
            }
        },
        "filter": {
            "username_ngram": {
                "type": "edgeNGram",
                "side" : "front",
                "min_gram": 1,
                "max_gram": 15
            }
        }
    }
}
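
To make the intent of these two analyzers concrete, here is a minimal Python sketch of the tokens they would be expected to produce. It is a plain-Python approximation, not Elasticsearch code: the functions `username_index` and `username_search` simply mirror the analyzer names above, `edge_ngrams` is a hypothetical helper, and the sketch assumes the keyword tokenizer emits the whole input as a single token.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Front-anchored n-grams of one token, approximating the username_ngram filter."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def username_index(text):
    """Approximates the index analyzer: keyword tokenizer, lowercase, edge n-grams."""
    return edge_ngrams(text.lower())


def username_search(text):
    """Approximates the search analyzer: keyword tokenizer, lowercase only."""
    return [text.lower()]


if __name__ == "__main__":
    # Index side: one term per prefix of the lowercased user name.
    print(username_index("BrianDilley"))
    # Search side: a single term, the whole lowercased query string.
    print(username_search("BrianDilley"))
```

If the search analyzer is honoured, a query for the full name should therefore produce exactly one term, and only a document whose indexed prefixes include that whole string can match.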

mapping:

{
    "user_follow": {

        "properties": {
            "targetId": { "type": "string", "store": true },
            "followerId": { "type": "string", "store": true },
            "dateUpdated": { "type": "date", "store": true },

            "userName": {
                "type": "multi_field",
                "fields": {
                    "userName": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "autocomplete": {
                        "type": "string",
                        "index_analyzer": "username_index",
                        "search_analyzer": "username_search"
                    }
                }
            }
        }
    }
}

search:

{
    "from" : 0,
    "size" : 50,
    "query" : {
        "bool" : {
            "must" : [ {
                "field" : {
                    "targetId" : "51888c1b04a6a214e26a4009"
                }
            }, {
                "match" : {
                    "userName.autocomplete" : {
                        "query" : "brian",
                        "type" : "boolean"
                    }
                }
            } ]
        }
    },
    "fields" : "followerId"
}

I've tried matchQuery, matchPhraseQuery, textQuery and termQuery (Java DSL API) and I get the same results every time.

I think that you're not doing exactly what you think you're doing. This is why it is useful to present an actual test case with full curl statements, rather than abbreviating it.

Your example above works for me (slightly modified):

Create the index with settings and mapping:

curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1'  -d '
{
  "mappings" : {
     "test" : {
        "properties" : {
           "userName" : {
              "fields" : {
                 "autocomplete" : {
                    "search_analyzer" : "username_search",
                    "index_analyzer" : "username_index",
                    "type" : "string"
                 },
                 "userName" : {
                    "index" : "not_analyzed",
                    "type" : "string"
                 }
              },
              "type" : "multi_field"
           }
        }
     }
  },
  "settings" : {
     "analysis" : {
        "filter" : {
           "username_ngram" : {
              "max_gram" : 15,
              "min_gram" : 1,
              "type" : "edge_ngram"
           }
        },
        "analyzer" : {
           "username_index" : {
              "filter" : [
                 "lowercase",
                 "username_ngram"
              ],
              "tokenizer" : "keyword"
           },
           "username_search" : {
              "filter" : [
                 "lowercase"
              ],
              "tokenizer" : "keyword"
           }
        }
     }
  }
}
'

Index some data:

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '{
  "userName" : "briangriffen"
}
'

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
  "userName" : "brianlilley"
}
'

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
  "userName" : "briangumble"
}
'

A search for brian finds all documents:

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1'  -d '{
  "query" : {
     "match" : {
        "userName.autocomplete" : "brian"
     }
  }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "userName" : "briangriffen"
#             },
#             "_score" : 0.1486337,
#             "_index" : "test",
#             "_id" : "AWzezvEFRIykOAr75QbtcQ",
#             "_type" : "test"
#          },
#          {
#             "_source" : {
#                "userName" : "briangumble"
#             },
#             "_score" : 0.1486337,
#             "_index" : "test",
#             "_id" : "qIABuMOiTyuxLOiFOzcURg",
#             "_type" : "test"
#          },
#          {
#             "_source" : {
#                "userName" : "brianlilley"
#             },
#             "_score" : 0.076713204,
#             "_index" : "test",
#             "_id" : "fGgTITKvR6GJXI_cqA4Vzg",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 0.1486337,
#       "total" : 3
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "took" : 8
# }

A search for brianlilley finds just that document:

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1'  -d '
{
  "query" : {
     "match" : {
        "userName.autocomplete" : "brianlilley"
     }
  }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "userName" : "brianlilley"
#             },
#             "_score" : 0.076713204,
#             "_index" : "test",
#             "_id" : "fGgTITKvR6GJXI_cqA4Vzg",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 0.076713204,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "took" : 4
# }
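
The two outcomes discussed in this thread (one hit versus three for the full name) can be reproduced with a small Python model of term matching. This is an illustrative sketch under stated assumptions, not Elasticsearch itself: `edge_ngrams`, `matches` and the two `search_with_*` functions are hypothetical helpers, a match query is modelled as "any query term appears among the document's indexed terms", and scoring is ignored.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Front-anchored n-grams of one token (assumed model of the edge n-gram filter)."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


# The three user names indexed in the test case above.
DOCS = ["briangriffen", "brianlilley", "briangumble"]

# Index side: each document is stored as the set of its lowercased prefixes.
INDEX = {doc: set(edge_ngrams(doc.lower())) for doc in DOCS}


def matches(query_terms):
    """Return the documents that share at least one term with the query."""
    wanted = set(query_terms)
    return sorted(doc for doc, terms in INDEX.items() if terms & wanted)


def search_with_search_analyzer(text):
    """Query analyzed as keyword + lowercase only: a single term."""
    return matches([text.lower()])


def search_with_index_analyzer(text):
    """Query (wrongly) analyzed with the n-gram filter: one term per prefix."""
    return matches(edge_ngrams(text.lower()))


if __name__ == "__main__":
    print(search_with_search_analyzer("brian"))
    print(search_with_search_analyzer("brianlilley"))
    print(search_with_index_analyzer("brianlilley"))
```

In this model, running the query through the n-gram filter expands 'brianlilley' into 'b', 'br', 'bri' and so on, so it matches every document. That is the symptom described in the question, and it suggests the search analyzer was not being applied in the original setup.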
