
Elasticsearch index field with wildcard and search for it

I have a document with a field "serial number". That serial number is ABC.XXX.DEF, where XXX indicates wildcards. XXX can be \d{3}[a-zA-Z0-9].

So users can search for:

ABC.123.DEF

ABC.234.DEF

ABC.XYZ.DEF

while the document only includes

ABC.XXX.DEF

When a user queries ABC.123.DEF, I need a hit on the document containing ABC.XXX.DEF. Other documents might contain ABC.DEF.XXX and must not be hit. With my basic Elasticsearch knowledge I am running out of ideas.

Do I have to attack the problem from the query side or when analyzing/tokenizing the pattern?

Can anyone give me an example of how to approach that problem?

As long as the serial number is well defined, the first solution that comes to my mind is to split the serial number into three parts ("part1", "part2" and "part3", for example) and index them as three separate fields. Parts consisting of wildcards should have a special value or may not be indexed at all. Then at query time I would split the serial number provided by the user in the same way. Assuming that parts consisting of wildcards are not indexed, my query would look like this:

"query": {
  "bool": {
    "must":[
      {
        "bool": {
          "should": [
            {
              "match": {
                "part1": "ABC"
              }
            },
            {
              "bool": {
                "must_not": {
                  "exists": {
                    "field": "part1"
                  }
                }
              }
            }
          ]
        }
      },
      ... // Similar code for other parts
    ] 
  }
}
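
To make the "similar code for other parts" concrete, here is a minimal Python sketch of the idea, not a definitive implementation. It assumes serial numbers always have exactly three dot-separated parts and that the literal string XXX marks a wildcard part in the stored value. The helper names (split_serial, to_document, build_query, matches) and the serial_number field are hypothetical; only part1, part2 and part3 come from the text above. It builds plain dictionaries and does not call any Elasticsearch client.

```python
WILDCARD = "XXX"  # assumed wildcard placeholder in stored serial numbers
FIELDS = ("part1", "part2", "part3")  # field names suggested above


def split_serial(serial: str) -> list[str]:
    """Split 'ABC.123.DEF' into its three dot-separated parts."""
    parts = serial.split(".")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} parts, got {serial!r}")
    return parts


def to_document(serial: str) -> dict:
    """Build the document to index; wildcard parts are not indexed at all."""
    doc = {"serial_number": serial}  # hypothetical field keeping the raw value
    for field, part in zip(FIELDS, split_serial(serial)):
        if part != WILDCARD:
            doc[field] = part
    return doc


def build_query(user_serial: str) -> dict:
    """For every part: the field matches the user's value OR does not exist."""
    clauses = []
    for field, part in zip(FIELDS, split_serial(user_serial)):
        clauses.append({
            "bool": {
                "should": [
                    {"match": {field: part}},
                    {"bool": {"must_not": {"exists": {"field": field}}}},
                ]
            }
        })
    return {"query": {"bool": {"must": clauses}}}


def matches(doc: dict, user_serial: str) -> bool:
    """Plain-Python stand-in for the query logic, assuming exact-value matching."""
    return all(
        field not in doc or doc[field] == part
        for field, part in zip(FIELDS, split_serial(user_serial))
    )
```

With this sketch, to_document("ABC.XXX.DEF") omits part2, so a search for ABC.123.DEF satisfies all three clauses, while a document built from ABC.DEF.XXX fails on part2 (DEF is indexed and is not 123). The matches function only mirrors the boolean logic locally; how match behaves in a real index depends on the field mapping and analyzer.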

