Query Elastic document field with and without characters

I have the following documents stored in my Elasticsearch index (my_index):

{
    "name": "111666"
},
{
    "name": "111A666"
},
{
    "name": "111B666"
}

and I want to be able to query these documents using both the exact value of the name field and a version of the value with the letters stripped out.

Examples

GET /my_index/my_type/_search
{
    "query": {
        "match": {
            "name": {
                "query": "111666"
            }
        }
    }
}

should return all three of the documents mentioned above.

On the other hand:

GET /my_index/my_type/_search
{
    "query": {
        "match": {
            "name": {
                "query": "111a666"
            }
        }
    }
}

should return just one document (the one whose name field matches the provided value exactly).

I didn't find a way to configure the settings of my_index to support this (custom search/index analyzers, etc.).

I should mention that I am using Elasticsearch's Java API (QueryBuilders) to implement the queries above, so I thought of doing it the Java way.

Logic

1) Check if the provided query string contains a letter
2) If yes (e.g. 111A666), then search for 111A666 using a standard search analyzer
3) If not (e.g. 111666), then use a custom search analyzer that strips the letters from the `name` field
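
The three steps above can be sketched in plain Java without any Elasticsearch dependency. This is only an illustration under assumptions: the analyzer name `strip_letters` is hypothetical and would have to be defined in the index settings, and the request body is built as a string here rather than through QueryBuilders.

```java
/** Sketch: choose a search-time analyzer from the shape of the user input. */
final class AnalyzerChoice {
    // Analyzer names are assumptions; "strip_letters" is a hypothetical
    // custom analyzer that would have to exist in the index settings.
    static final String EXACT_ANALYZER = "standard";
    static final String STRIP_LETTERS_ANALYZER = "strip_letters";

    /** Step 1: true if the input has at least one letter. */
    static boolean containsLetter(String input) {
        return input.chars().anyMatch(Character::isLetter);
    }

    /** Steps 2 and 3: pick the analyzer name for this input. */
    static String analyzerFor(String input) {
        return containsLetter(input) ? EXACT_ANALYZER : STRIP_LETTERS_ANALYZER;
    }

    /** Minimal escaping of backslashes and double quotes for a JSON string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a match query body that names the analyzer for this request. */
    static String matchQueryJson(String field, String input) {
        return String.format(
            "{\"query\":{\"match\":{\"%s\":{\"query\":\"%s\",\"analyzer\":\"%s\"}}}}",
            field, escape(input), analyzerFor(input));
    }

    public static void main(String[] args) {
        System.out.println(matchQueryJson("name", "111A666"));
        System.out.println(matchQueryJson("name", "111666"));
    }
}
```

With the Java API, the same choice could presumably be passed through `QueryBuilders.matchQuery("name", input).analyzer(...)`. Keep in mind that an analyzer applied at search time only transforms the query text; it cannot change tokens that are already in the index, so the letter-free form of each name would still have to be indexed for step 3 to find anything.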

Questions

1) Is it possible to implement this by somehow configuring how the data is stored/indexed in Elasticsearch?

2) If not, is it possible to conditionally change the analyzer of a field at runtime (using Java)?

You can use any built-in analyzer or any custom analyzer when mapping your documents in Elasticsearch. More information on analyzers is here.

The "term" query searches for an exact match. You can find more information about exact matching here (Finding Exact Values).

But you cannot change the analyzer of an existing field once the index has been created. If you want to change it, you have to create a new index and migrate all your data to the new index.

Your question is about using different analyzer logic at index time and at query time.

The solution for your Q1 is to generate two tokens at index time (111a666 -> [111a666, 111666]) but only one token at query time (111a666 -> 111a666 and 111666 -> 111666).
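
To see why this asymmetry gives the desired results, here is a small in-memory simulation in plain Java. It is not Elasticsearch code: `indexTokens` and `queryToken` are hypothetical stand-ins for an index analyzer and a search analyzer, and lowercasing is assumed on both sides.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Simulation: two tokens at index time, one token at query time. */
final class TwoTokenSketch {
    /** Stand-in for the index analyzer: lowercased value plus letter-free form. */
    static Set<String> indexTokens(String value) {
        String lower = value.toLowerCase(Locale.ROOT);
        Set<String> tokens = new LinkedHashSet<>();
        tokens.add(lower);
        // The set keeps a single token when the value has no letters.
        tokens.add(lower.replaceAll("[a-z]", ""));
        return tokens;
    }

    /** Stand-in for the search analyzer: lowercase only, letters are kept. */
    static String queryToken(String query) {
        return query.toLowerCase(Locale.ROOT);
    }

    /** Returns the documents whose index tokens contain the query token. */
    static List<String> search(List<String> docs, String query) {
        String token = queryToken(query);
        return docs.stream()
                .filter(doc -> indexTokens(doc).contains(token))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("111666", "111A666", "111B666");
        System.out.println(search(docs, "111666"));  // [111666, 111A666, 111B666]
        System.out.println(search(docs, "111a666")); // [111A666]
    }
}
```

Because the query side never strips letters, 111a666 can only match the document that carries that exact token, while 111666 matches the letter-free token that every document carries.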

IMHO you have to write a new token filter, similar to https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern_replace-tokenfilter.html, that supports "preserve_original" the way https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-capture-tokenfilter.html does. Or you could use two fields (one with the original value and one without letters) and search over both.
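
The two-field alternative could look roughly like the following on the client side. This is a sketch under assumptions: the second field name `name_digits` is hypothetical, the letter-free copy is computed in Java before indexing, and the input is assumed to be alphanumeric, so no JSON escaping is done.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: index the original value plus a letter-free copy, search both. */
final class TwoFieldSketch {
    // Hypothetical name for the field that holds the letter-free copy.
    static final String DIGITS_FIELD = "name_digits";

    /** Document source with the original value and its letter-free copy. */
    static Map<String, String> toSource(String name) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("name", name);
        source.put(DIGITS_FIELD, name.replaceAll("\\p{L}", ""));
        return source;
    }

    /** multi_match request body over both fields (alphanumeric input assumed). */
    static String queryJson(String input) {
        return String.format(
            "{\"query\":{\"multi_match\":{\"query\":\"%s\",\"fields\":[\"name\",\"%s\"]}}}",
            input, DIGITS_FIELD);
    }

    public static void main(String[] args) {
        System.out.println(toSource("111A666"));
        System.out.println(queryJson("111666"));
    }
}
```

The query text is deliberately left untouched: 111666 matches the letter-free copy of every document, whereas 111a666 can only match the original `name` value, provided that field is analyzed case-insensitively.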

