Query Elastic document field with and without characters
I have the following documents stored in my Elasticsearch index (my_index):
{
"name": "111666"
},
{
"name": "111A666"
},
{
"name": "111B666"
}
and I want to be able to query these documents using both the exact value of the name field and a version of the value with the letters stripped out.
Examples
GET /my_index/my_type/_search
{
"query": {
"match": {
"name": {
"query": "111666"
}
}
}
}
should return all three documents mentioned above.
On the other hand:
GET /my_index/my_type/_search
{
"query": {
"match": {
"name": {
"query": "111a666"
}
}
}
}
should return just one document (the one that exactly matches the provided value of the name field).
I didn't find a way to configure the settings of my_index to support such functionality (custom search/index analyzers, etc.).
I should mention here that I am using Elasticsearch's Java API (QueryBuilders) to implement the above-mentioned queries, so I thought of doing it the Java way.
Logic
1) Check if the provided query-string contains a letter
2) If yes (e.g. 111A666), then search for 111A666 using a standard search analyzer
3) If not (e.g. 111666), then use a custom search analyzer that strips the letters from the name field
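The branching in the three steps above could be sketched in plain Java roughly as follows. This is only an illustration: "standard" is a built-in analyzer, but "strip_letters" is a hypothetical analyzer name that would have to be defined in the index settings, and the class and method names are made up for this example.

```java
/** Sketch of the client-side branching; the analyzer names are assumptions. */
final class AnalyzerChoice {
    // "standard" is a built-in Elasticsearch analyzer; "strip_letters" is a
    // hypothetical custom analyzer that would have to exist in the index settings.
    static final String EXACT_ANALYZER = "standard";
    static final String STRIPPING_ANALYZER = "strip_letters";

    /** Step 1: does the query string contain at least one letter? */
    static boolean containsLetter(String query) {
        return query.chars().anyMatch(Character::isLetter);
    }

    /** Steps 2 and 3: pick the search analyzer name for this query string. */
    static String chooseAnalyzer(String query) {
        return containsLetter(query) ? EXACT_ANALYZER : STRIPPING_ANALYZER;
    }

    public static void main(String[] args) {
        for (String q : new String[] {"111A666", "111666"}) {
            // With the Java API the chosen name could be passed on, e.g.
            // QueryBuilders.matchQuery("name", q).analyzer(chooseAnalyzer(q))
            System.out.println(q + " -> " + chooseAnalyzer(q));
        }
    }
}
```

One caveat with this approach: a search analyzer only transforms the query string, and stripping letters from a query that has none changes nothing. For 111666 to match 111A666, the letter-free form would also have to be present in the index.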
Questions
1) Is it possible to implement this by somehow configuring how the data are stored/indexed in Elasticsearch?
2) If not, is it possible to conditionally change the analyzer of a field at runtime (using Java)?
You can easily use any built-in analyzer or any custom analyzer to map your documents in Elasticsearch. More information on analyzers is available here.
The "term" query searches for an exact match. You can find more information about exact matches here (Finding Exact Values).
But you cannot change how existing fields are analyzed once the index has been created. If you want to change that, you have to create a new index and migrate all your data to the new index.
Your question is about different logic for the analyzer at index and query time.
The solution for your Q1 is to generate two tokens at index time (111a666 -> [111a666, 111666]) but only one token at query time (111a666 -> 111a666 and 111666 -> 111666).
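To make the effect of that asymmetry concrete, here is a small plain-Java simulation. It does not talk to Elasticsearch; it only mimics the idea of two tokens at index time and one token at query time. The lowercasing imitates what the standard analyzer does, and all class and method names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Plain-Java simulation of the two analyzers; it does not talk to Elasticsearch. */
final class TwoTokenSimulation {
    /** Index time: keep the lowercased original and add a letters-removed variant. */
    static Set<String> indexTokens(String value) {
        String original = value.toLowerCase(Locale.ROOT);
        Set<String> tokens = new LinkedHashSet<>();
        tokens.add(original);
        // If the value has no letters this adds the same token again (a no-op for a set).
        tokens.add(original.replaceAll("[a-z]", ""));
        return tokens;
    }

    /** Query time: lowercase only, so exactly one token is produced. */
    static String queryToken(String query) {
        return query.toLowerCase(Locale.ROOT);
    }

    /** Returns the stored values whose index tokens contain the query token. */
    static List<String> search(List<String> storedValues, String query) {
        String token = queryToken(query);
        List<String> hits = new ArrayList<>();
        for (String value : storedValues) {
            if (indexTokens(value).contains(token)) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("111666", "111A666", "111B666");
        System.out.println(search(docs, "111666"));  // [111666, 111A666, 111B666]
        System.out.println(search(docs, "111a666")); // [111A666]
    }
}
```

Because every document carries its letter-free token, the digits-only query hits all three, while a query that contains a letter can only match the document that kept that exact token.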
IMHO you have to write a new analyzer, something like https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern_replace-tokenfilter.html but one that supports "preserve_original" the way https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-capture-tokenfilter.html does. Or you could use two fields (one with the original value and one without letters) and search over both.
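The two-field variant could look roughly like this on the Java side, assuming the letter-free copy is computed by the client before indexing. The field name "name_stripped" and the helper names are hypothetical, and the matching method only simulates a search over both fields rather than calling Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch of the two-field variant; the field name "name_stripped" is hypothetical. */
final class TwoFieldDocument {
    /** Builds the document source with the original value plus a letters-removed copy. */
    static Map<String, Object> toSource(String name) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", name);
        source.put("name_stripped", name.replaceAll("[A-Za-z]", ""));
        return source;
    }

    /** Simulates searching both fields: a hit if either one equals the query, ignoring case. */
    static boolean matches(Map<String, Object> source, String query) {
        String q = query.toLowerCase(Locale.ROOT);
        return source.values().stream()
                .anyMatch(v -> v.toString().toLowerCase(Locale.ROOT).equals(q));
    }

    public static void main(String[] args) {
        Map<String, Object> doc = toSource("111A666");
        System.out.println(doc);                     // {name=111A666, name_stripped=111666}
        System.out.println(matches(doc, "111666"));  // true
        System.out.println(matches(doc, "111b666")); // false
    }
}
```

Against a real cluster, a map like this could serve as the document source, and both fields could be searched in one request, for example with a multi_match query over name and name_stripped.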