Analyze and match all terms in the same order

What I want to achieve:

document: "one two three four"

search strings:

  • "one four" (must match)
  • "four one" (must not match)

What I've learned so far:

For order to be taken into account, the span_near query should be used, but this assumes that the terms have already been analyzed by the client (all terms must be supplied separately).

To have the search string analyzed, the match_phrase query should be used, but once slop is set to allow gaps between the terms it no longer strictly enforces their order.

A script is probably needed (thanks @ChintanShah25), but it seems impossible to analyze the input string inside the script.

How can I satisfy both the analysis and the ordering requirement?

There is no straightforward way to achieve this. You could do it either by using the _analyze endpoint together with a span query, or with a script and match_phrase.

1) Pass your search string to the _analyze endpoint:

curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "my_custom_analyzer",
  "text" : "one four"
}'

You will get something like this:

{
   "tokens": [
      {
         "token": "one",
         "start_offset": 0,
         "end_offset": 3,
         "type": "<ALPHANUM>",
         "position": 1
      },
      {
         "token": "four",
         "start_offset": 4,
         "end_offset": 8,
         "type": "<ALPHANUM>",
         "position": 2
      }
   ]
}

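You then pass the tokens to the span query, one span_term clause per token (here "field" stands for the name of your field, and token1, token2 for the tokens returned by _analyze):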

{
    "span_near" : {
        "clauses" : [
            { "span_term" : { "field" : "token1" } },
            { "span_term" : { "field" : "token2" } }
        ],
        "slop" : 2,
        "in_order" : true,
        "collect_payloads" : false
    }
}
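
The step from the _analyze output to the span query can be automated on the client. The following is a minimal Python sketch, not part of the original answer: the helper name build_span_near and the field name "body" are hypothetical, and the response dict is simply the sample output shown above. It makes no HTTP calls and leaves out collect_payloads for brevity.

```python
import json


def build_span_near(analyze_response, field, slop=2):
    """Turn an _analyze response (as a dict) into a span_near query body.

    One span_term clause is emitted per token, ordered by token position.
    `field` is the name of the indexed field to search (an assumption here).
    """
    tokens = sorted(analyze_response["tokens"], key=lambda t: t["position"])
    clauses = [{"span_term": {field: tok["token"]}} for tok in tokens]
    return {
        "span_near": {
            "clauses": clauses,
            "slop": slop,
            "in_order": True,  # this is what enforces the term order
        }
    }


# Sample _analyze output for the text "one four", copied from above.
response = {
    "tokens": [
        {"token": "one", "start_offset": 0, "end_offset": 3,
         "type": "<ALPHANUM>", "position": 1},
        {"token": "four", "start_offset": 4, "end_offset": 8,
         "type": "<ALPHANUM>", "position": 2},
    ]
}

query = build_span_near(response, field="body", slop=2)
print(json.dumps({"query": query}, indent=2))
```

The slop value still has to be chosen by the caller: it must be at least as large as the number of positions allowed between the matched terms, which is 2 for "one four" against "one two three four".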

2) Another way is to use advanced scripting. Have a look at @Andrei Stefan's answer to a related question: he used _POSITIONS with match_phrase to return only results whose terms are in order.
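
The check such a script has to perform is a comparison of term positions. As an illustration of that logic only, here is a small Python sketch that runs outside Elasticsearch; it is not the script from the linked answer, and the function name terms_in_order and the hand-built position map are hypothetical.

```python
from bisect import bisect_right


def terms_in_order(positions, query_terms):
    """Return True if the query terms occur in the document in the given order.

    `positions` maps each document term to a sorted list of its positions.
    For every query term, the earliest position after the previously
    matched one is taken; if none exists, the order requirement fails.
    """
    last = -1
    for term in query_terms:
        term_positions = positions.get(term, [])
        i = bisect_right(term_positions, last)
        if i == len(term_positions):
            return False
        last = term_positions[i]
    return True


# Position map for the document "one two three four".
doc_positions = {"one": [0], "two": [1], "three": [2], "four": [3]}

print(terms_in_order(doc_positions, ["one", "four"]))  # True
print(terms_in_order(doc_positions, ["four", "one"]))  # False
```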

Hope this helps! 希望这可以帮助!
