
Elasticsearch match_phrase query is not working properly

I am trying to search for the document below using a match_phrase query in Kibana, but I am not getting any hits.

Here is the document as it is indexed in Elasticsearch:

    {
       "took": 7,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "skipped": 0,
          "failed": 0
       },
       "hits": {
          "total": 2910,
          "max_score": 1.0,
          "hits": [
             {
                "_index": "documents",
                "_type": "doc",
                "_id": "DmLD22MBFTg0XFZppYt8",
                "_score": 1.0,
                "_source": {
                   "doct_country": "DE",
                   "filename": "series_Accessories_v1_de-DE.pdf"
                }
             }
          ]
       }
    }

Here is the query I am using to search for the document above:

GET documents/_search
{
    "query": {
        "match_phrase" : {
            "message" : "Accessories_v1_de-DE.pdf"
        }
    }
}

For the above query I am getting this response:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

There are two issues. First, in your query you presumably mean to use the filename field rather than message, which is not present in your example document:

GET documents/_search
{
    "query": {
        "match_phrase" : {
            "filename" : "Accessories_v1_de-DE.pdf"
        }
    }
}

Second, you need Elasticsearch to know that the filename field should be indexed with _ treated as a token separator. This does not happen by default. One way to do this is to define your mapping as follows:

PUT /documents
{
    "mappings" : {
        "document" : {
            "properties" : {
                "filename" : { "type" : "text", "analyzer": "simple" }
            }
        }
    }
}
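Note that the mapping of an existing field cannot be changed in place. If the documents index already exists, one approach is to create a new index with the mapping above and copy the data across with the _reindex API (the target index name below is illustrative):

POST _reindex
{
    "source": { "index": "documents" },
    "dest":   { "index": "documents_v2" }
}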

The simple analyzer splits on any non-letter character, so underscores and digits are treated as token separators. Depending on your application, you may need finer-grained control over tokenization; see the analyzers documentation.
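To check how a filename will actually be tokenized, you can run the analyzer against a sample value with the _analyze API. For example, with the simple analyzer the request below should produce the tokens series, accessories, v, de, de, pdf (so a match_phrase on "Accessories_v1_de-DE.pdf" can match):

POST _analyze
{
    "analyzer": "simple",
    "text": "series_Accessories_v1_de-DE.pdf"
}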
