
ElasticSearch: an alternative to match_phrase that supports fuzziness

My documents have a 'description' field containing between 3 and 10 sentences.

I have to support fuzziness because I can't expect the user to type exactly the same words.

On the other hand, I have to use "match_phrase" rather than "match", because if the words are too far from each other the document is not relevant.

The problem is that "match_phrase" has no fuzziness option, so a single misspelled word prevents the phrase from matching (see the last paragraph here: https://www.elastic.co/guide/en/elasticsearch/guide/master/phrase-matching.html ).
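To make the trade-off concrete, here is a rough sketch of the two request bodies, written as Python dicts. The misspelled input "hello wrold" and the `"operator": "and"` setting are assumptions added for illustration, not part of the original question. Each query covers only one of the two requirements.

```python
import json

# Hypothetical user input containing a typo ("wrold" instead of "world").
user_text = "hello wrold"

# "match" accepts a fuzziness setting, but it does not care how far apart
# the matched words are inside the description.
fuzzy_match = {
    "query": {
        "match": {
            "description": {
                "query": user_text,
                "fuzziness": "AUTO",
                "operator": "and",  # assumption: require every word to match
            }
        }
    }
}

# "match_phrase" limits the distance between words through "slop",
# but it offers no fuzziness parameter, so the typo will not match.
phrase_match = {
    "query": {
        "match_phrase": {
            "description": {
                "query": user_text,
                "slop": 2,
            }
        }
    }
}

print(json.dumps(fuzzy_match, indent=2))
print(json.dumps(phrase_match, indent=2))
```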

I guess I need a creative solution here to satisfy both requirements, perhaps by using other search queries.

After some digging into the 'span' queries, it turns out that both requirements above can be met by combining 'span_near' with 'span_multi'.

Here is an example of searching for "hello world" in the "description" field.

{
    "span_near": {
        "clauses": [{
            "span_multi": {
                "match": {
                    "fuzzy": {
                        "description": {
                            "value": "hello"
                        }
                    }
                }
            }
        }, {
            "span_multi": {
                "match": {
                    "fuzzy": {
                        "description": {
                            "value": "world"
                        }
                    }
                }
            }
        }],
        "slop": 2,
        "in_order": false,
        "collect_payloads": false
    }
}
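For phrases of arbitrary length, the same structure can be generated rather than written by hand. The following is only a sketch: the helper name `fuzzy_phrase_query`, the outer `"query"` wrapper, the explicit `"fuzziness": "AUTO"` setting, and the lowercase-and-split tokenization are my assumptions, not part of the answer above. Since fuzzy is a term-level query, the words are not run through the field's analyzer, so the helper has to approximate it. I left out "collect_payloads" because, as far as I know, newer Elasticsearch versions deprecated it.

```python
import json


def fuzzy_phrase_query(field, phrase, slop=2, in_order=False, fuzziness="AUTO"):
    """Build a span_near request body with one fuzzy span_multi clause per word.

    Hypothetical helper for illustration only.
    """
    # Assumption: lower() + split() roughly mimics the field's analyzer.
    # Adjust this step if the field uses stemming or a custom tokenizer.
    terms = phrase.lower().split()
    if not terms:
        raise ValueError("phrase must contain at least one word")

    clauses = [
        {
            "span_multi": {
                "match": {
                    "fuzzy": {
                        field: {"value": term, "fuzziness": fuzziness}
                    }
                }
            }
        }
        for term in terms
    ]

    return {
        "query": {
            "span_near": {
                "clauses": clauses,
                "slop": slop,          # maximum number of intervening positions
                "in_order": in_order,  # False allows "world hello" as well
            }
        }
    }


body = fuzzy_phrase_query("description", "Hello world")
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the body of a search request. Keep in mind that each fuzzy clause may expand to many terms, so long phrases can get expensive.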
