I have the following ElasticSearch query, which I expected to return only the documents whose email field equals myemail@gmail.com:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "email": "myemail@gmail.com"
          }
        }
      ]
    }
  }
}
The mapping for the user type that is being searched is the following:
{
  "users": {
    "mappings": {
      "user": {
        "properties": {
          "email": {
            "type": "string"
          },
          "name": {
            "type": "string",
            "fields": {
              "raw": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "nickname": {
            "type": "string"
          }
        }
      }
    }
  }
}
The following is a sample of the results returned from ElasticSearch:
[
  {
    "_index": "users",
    "_type": "user",
    "_id": "54b19c417dcc4fe40d728e2c",
    "_score": 0.23983537,
    "_source": {
      "email": "johnsmith@gmail.com",
      "name": "John Smith",
      "nickname": "jsmith"
    }
  },
  {
    "_index": "users",
    "_type": "user",
    "_id": "9c417dcc4fe40d728e2c54b1",
    "_score": 0.23983537,
    "_source": {
      "email": "myemail@gmail.com",
      "name": "Walter White",
      "nickname": "wwhite"
    }
  },
  {
    "_index": "users",
    "_type": "user",
    "_id": "4fe40d728e2c54b19c417dcc",
    "_score": 0.23983537,
    "_source": {
      "email": "JimmyFallon@gmail.com",
      "name": "Jimmy Fallon",
      "nickname": "jfallon"
    }
  }
]
From the above query, I would expect a document to be returned only if its email property is exactly 'myemail@gmail.com'.
How does the ElasticSearch DSL query need to change in order to return only exact matches on email?
The email field was tokenized, which is the reason for this anomaly. When you indexed the document, the value was split into tokens:
"myemail@gmail.com" => [ "myemail" , "gmail.com" ]
This way, a search for either myemail or gmail.com will match the document. The analyzer is also applied to the search query, so when you search for john@gmail.com, the query gets broken into
"john@gmail.com" => [ "john" , "gmail.com" ]
Because the "gmail.com" token is common to the search term and the indexed term, you get a match.
To override this behavior, declare the email field as not_analyzed. That way tokenization won't happen and the entire string will be indexed as a single term.
With "not_analyzed":
"john@gmail.com" => [ "john@gmail.com" ]
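The difference between the two cases can be illustrated with a small Python sketch. This is only a rough model, not ElasticSearch's actual analyzer: the helper names (analyze, keyword, matches) are hypothetical, and the splitting rule is simplified to lowercasing and splitting on "@" and whitespace.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzed field: lowercase, then split on '@' and whitespace."""
    return [tok for tok in re.split(r"[@\s]+", text.lower()) if tok]


def keyword(text):
    """Stand-in for a not_analyzed field: the whole value is kept as one token."""
    return [text]


def matches(query, value, tokenize):
    """Model of a match query with OR semantics: a hit if any token is shared."""
    return bool(set(tokenize(query)) & set(tokenize(value)))


emails = ["johnsmith@gmail.com", "myemail@gmail.com", "JimmyFallon@gmail.com"]
query = "myemail@gmail.com"

# Analyzed: every address shares the "gmail.com" token with the query.
analyzed_hits = [e for e in emails if matches(query, e, analyze)]

# Not analyzed: only the identical string shares a token with the query.
exact_hits = [e for e in emails if matches(query, e, keyword)]

print(analyzed_hits)
print(exact_hits)
```

In this model the analyzed field returns all three sample users, which mirrors the results in the question, while the not_analyzed field returns only the exact address.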
So modify the mapping to this and you should be good:
{
  "users": {
    "mappings": {
      "user": {
        "properties": {
          "email": {
            "type": "string",
            "index": "not_analyzed"
          },
          "name": {
            "type": "string",
            "fields": {
              "raw": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "nickname": {
            "type": "string"
          }
        }
      }
    }
  }
}
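Once the field is not_analyzed, an exact lookup can also be written as a term query, which compares the search value against the indexed term without analyzing it. Below is a minimal sketch that builds such a request body in Python; the helper name exact_email_query is hypothetical, and how the body is sent to the cluster is left out.

```python
import json


def exact_email_query(email):
    """Build a search body that uses a term query on the email field."""
    return {"query": {"term": {"email": email}}}


body = exact_email_query("myemail@gmail.com")

# Print the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```

Keep in mind that, as far as I know, the index setting of an existing field cannot be changed in place, so you will most likely have to recreate the index with the new mapping and reindex the documents before exact matching takes effect.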
I have described the problem more precisely, along with another approach to solving it, here.