
elasticsearch - search with regex involving space

I want to search in Elasticsearch with a regular expression that contains whitespace. I have already set my field to not_analyzed. Its mapping looks like this:

"type1": {
   "properties": {
      "field1": {
         "type": "string",
         "index": "not_analyzed",
         "store": true
      }
   }
}

I indexed two values for testing:

"field1":"XXX YYY ZZZ"
"field1":"XXX ZZZ YYY"

Then I ran a regex query, /XXX YYY/ (I want this query to find record1 but not record2):

{
    "query": {
        "query_string": {
           "query": "/XXX YYY/"
        }
    }
}

But it returns 0 results.

However, if I search without using a regex (without the forward slashes '/'), both record1 and record2 are returned.

Is it the case that Elasticsearch cannot run a regex query that contains a space?

What you need is a term query, which does not tokenise the search query by breaking it down into smaller parts. More about the term query here: https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-term-query.html

There is a special kind of term-level query, the regexp query, that lets you use regular expressions. It should match whitespace as well: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html
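As a rough sketch of the difference, the two request bodies could be built like this in Python. The field name and values come from the question; the helper names `term_query` and `regexp_query` are made up for illustration, and nothing here talks to a cluster.

```python
import json

def term_query(field, value):
    """Body for a term query: the value is compared to the indexed term as-is."""
    return {"query": {"term": {field: value}}}

def regexp_query(field, pattern):
    """Body for a regexp query: the pattern is applied to each indexed term."""
    return {"query": {"regexp": {field: pattern}}}

# On a not_analyzed field the whole string is one term, so a term query
# only finds a document when given the complete stored value.
print(json.dumps(term_query("field1", "XXX YYY ZZZ"), indent=2))

# A regexp query can describe "starts with XXX YYY" instead.
print(json.dumps(regexp_query("field1", "XXX YYY.*"), indent=2))
```

Either body would be sent to the index's _search endpoint.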

You can keep using your query string, but your regexp is just missing a tiny part, i.e. the .* at the end. Elasticsearch regular expressions are anchored, so the pattern has to match the entire term rather than just part of it. If you run the query below, you'll get the single result you expect.

{
    "query": {
        "query_string": {
           "query": "/XXX YYY.*/"
        }
    }
}
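To see why the trailing .* matters, here is a small Python analogy. It assumes that the anchored matching Elasticsearch applies to a term behaves like `re.fullmatch` for these simple patterns; Python's regex syntax is not identical to Lucene's, so treat this as an illustration only. The helper name `anchored_matches` is hypothetical.

```python
import re

# The two values from the question; on a not_analyzed field each is one term.
values = ["XXX YYY ZZZ", "XXX ZZZ YYY"]

def anchored_matches(pattern, terms):
    """Return the terms that the pattern matches in full (anchored at both ends)."""
    return [t for t in terms if re.fullmatch(pattern, t)]

print(anchored_matches("XXX YYY", values))    # []
print(anchored_matches("XXX YYY.*", values))  # ['XXX YYY ZZZ']
```

The first pattern matches nothing because neither value is exactly "XXX YYY"; the second allows any trailing characters, so only record1 matches.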

You can use a regexp query to achieve this. Mind you, query performance may be slow, particularly with a leading .* in the pattern, since many terms may have to be checked. The query below searches for all documents in which the value of field1 contains "XXX YYY".

POST <index_name>/type1/_search
{
   "query": {
      "regexp": {
         "field1": ".*XXX YYY.*"
      }
   }
}
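If the text to search for comes from user input, it may contain characters that the regexp syntax treats as operators. The sketch below builds the same "contains" query body while escaping them. The reserved-character list is the one I recall from the Elasticsearch regexp syntax documentation, so check it against your version; the function names are hypothetical.

```python
# Characters assumed to be reserved in Elasticsearch/Lucene regexp syntax
# (standard operators plus the optional ones); verify for your version.
LUCENE_RESERVED = set('.?+*|{}[]()"\\#@&<>~')

def escape_lucene_regex(text):
    """Backslash-escape reserved characters so the text is matched literally."""
    return "".join("\\" + ch if ch in LUCENE_RESERVED else ch for ch in text)

def contains_query(field, literal):
    """Body for a regexp query matching terms that contain the literal text."""
    pattern = ".*" + escape_lucene_regex(literal) + ".*"
    return {"query": {"regexp": {field: pattern}}}

print(contains_query("field1", "XXX YYY"))
```

Spaces need no escaping here, so for "XXX YYY" the result is the same pattern as in the query above.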
