Elasticsearch search fails in field with special character and wildcard
I have a field in Elasticsearch with the value "PEI.H.02354.01.". When I search with a querystring such as
{
"query":{
"query_string":{
"query":"field:PEI.H.02354.01.",
"default_operator":"AND"
}
}
}
then a result is returned, which is the correct behavior. But if I search with a wildcard, no results are returned, e.g.
{
"query":{
"query_string":{
"query":"field:PEI.H.02354.01.*",
"default_operator":"AND"
}
}
}
The field is of type string and is analyzed. Below is the code that creates the index, including the analyzers and the mappings.
{
"settings":{
"analysis":{
"analyzer":{
"number":{
"type":"custom",
"tokenizer":"keyword",
"filter":[
"lowercase"
],
"char_filter":[
"number_filter"
]
},
"diacritical":{
"type":"custom",
"tokenizer":"standard",
"filter":[
"standard",
"lowercase",
"asciifolding",
"nfd_normalizer"
]
}
},
"filter":{
"nfd_normalizer":{
"type":"icu_normalizer",
"name":"nfc"
}
},
"char_filter":{
"number_filter":{
"type":"pattern_replace",
"pattern":"[^\\d]+",
"replacement":""
}
}
}
},
"mappings":{
"testType":{
"_source":{
"enabled":false
},
"_all":{
"enabled":false
},
"_timestamp":{
"enabled":"true",
"store":"yes"
},
"properties":{
"field":{
"store":"yes",
"type":"string",
"index":"analyzed",
"analyzer":"diacritical"
}
}
}
}
}
Finally, a sample insert is
{
"field": "PEI.H.02354.01."
}
Does anyone have any idea why this is happening and how to solve it?
See the query_string documentation:
Wildcarded terms are not analyzed by default - they are lowercased (lowercase_expanded_terms defaults to true) but no further analysis is done
Your stored data is broken up into two terms:
curl -XGET 'localhost:9200/myindex/_analyze?analyzer=diacritical&pretty' -d 'PEI.H.02354.01'
{
"tokens" : [ {
"token" : "pei.h",
"start_offset" : 0,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
}, {
"token" : "02354.01",
"start_offset" : 6,
"end_offset" : 14,
"type" : "<NUM>",
"position" : 2
} ]
}
But because your search term with a wildcard is only lowercased, it becomes pei.h.02354.01.* and won't match either of those terms.
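The mismatch can be illustrated with a small sketch. It does not call Elasticsearch: it takes the two terms from the _analyze output above as given, and uses Python's fnmatch as a stand-in for whole-term wildcard matching. The helper names lowercase_only and matching_terms are made up for this illustration.

```python
from fnmatch import fnmatchcase

# Terms produced by the "diacritical" analyzer for the stored value,
# copied from the _analyze output above (not computed here).
indexed_terms = ["pei.h", "02354.01"]


def lowercase_only(query_term):
    """Mimic the default handling of a wildcarded term: lowercase it, nothing else."""
    return query_term.lower()


def matching_terms(pattern, terms):
    """Return the indexed terms that the wildcard pattern matches as a whole."""
    return [t for t in terms if fnmatchcase(t, pattern)]


pattern = lowercase_only("PEI.H.02354.01.*")
print(pattern)                                   # the unanalyzed, lowercased pattern
print(matching_terms(pattern, indexed_terms))    # no single term matches it
print(matching_terms("02354*", indexed_terms))   # a pattern within one term does match
```

The wildcard is compared against individual indexed terms, so a pattern that spans what the tokenizer split into two terms has nothing to match.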
However, with analyze_wildcard set to true, you do get hits:
curl -XGET "http://localhost:9200/myindex/testType/_search?pretty" -d'
{
"query":{
"query_string":{
"query":"field:PEI.H.02354.01.*",
"default_operator":"AND",
"analyze_wildcard": true
}
}
}'
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.4142135,