Elasticsearch multiple values match without analyzer

Please forgive my limited knowledge of Elasticsearch. I have an Elasticsearch collection that contains the following documents:

{
    "date": "2013-12-30T00:00:00.000Z",
    "value": 2,
    "dimensions": {
        "region": "Coimbra District"

    }
}
{
    "date": "2013-12-30T00:00:00.000Z",
    "value": 1,
    "dimensions": {
        "region": "Federal District"        
    }
}
{
    "date": "2013-12-30T00:00:00.000Z",
    "value": 1,
    "dimensions": {
        "region": "Masovian Voivodeship"
    }
}

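These 3 JSON documents are indexed in the ES server. I have not provided any analyzer type (and don't know how to provide one :)). I am using Spring Data Elasticsearch and run the following query to search for documents whose region is 'Masovian Voivodeship' or 'Federal District':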

{
  "query_string" : {
    "query" : "Masovian Voivodeship OR Federal District",
    "fields" : [ "dimensions.region" ]
  }
}

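I expect it to return 2 hits. However, it returns all 3 documents (probably because the remaining document, "Coimbra District", also has the word "District" in it). How can I modify the query so that it performs an exact match and returns only 2 documents? I am using the following method: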

QueryBuilders.queryString(<OR string>).field("dimensions.region")

I have tried QueryBuilders.termsQuery, QueryBuilders.inQuery, and QueryBuilders.matchQuery (with an array), but had no luck.

Can anyone help? Thanks in advance.

There are a couple of things you can do here.

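First, I set up an index without any explicit mapping or analysis, which means the standard analyzer will be used. This is important because it determines how we can query the text field.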
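As a rough illustration of what the standard analyzer does to these region values, the sketch below lowercases the text and splits it on non-alphanumeric characters. This is only an approximation written for this example (the real analyzer follows Unicode text segmentation rules), and `approx_standard_analyze` is a hypothetical helper name, not part of Elasticsearch.

```python
import re


def approx_standard_analyze(text):
    """Approximate the standard analyzer: lowercase, then split on non-alphanumerics.

    Assumption: a simplification for plain ASCII input, not the real algorithm.
    """
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


for region in ["Coimbra District", "Federal District", "Masovian Voivodeship"]:
    print(region, "->", approx_standard_analyze(region))
```

The point is that each region is indexed as separate lowercase terms, so "Coimbra District" and "Federal District" share the term `district`.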

So I started with:

DELETE /test_index

PUT /test_index
{
   "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
   }
}

PUT /test_index/doc/1
{
    "date": "2013-12-30T00:00:00.000Z",
    "value": 2,
    "dimensions": {
        "region": "Coimbra District"

    }
}

PUT /test_index/doc/2
{
    "date": "2013-12-30T00:00:00.000Z",
    "value": 1,
    "dimensions": {
        "region": "Federal District"        
    }
}

PUT /test_index/doc/3
{
    "date": "2013-12-30T00:00:00.000Z",
    "value": 1,
    "dimensions": {
        "region": "Masovian Voivodeship"
    }
}

Then I tried your query and got no hits. I don't understand why you have "dimensions.ga:region" in your fields parameter, but when I changed it to "dimensions.region" I got some results:

POST /test_index/doc/_search
{
   "query": {
      "query_string": {
         "query": "Masovian Voivodeship OR Federal District",
         "fields": [
            "dimensions.region"
         ]
      }
   }
}
...
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 0.46911472,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "3",
            "_score": 0.46911472,
            "_source": {
               "date": "2013-12-30T00:00:00.000Z",
               "value": 1,
               "dimensions": {
                  "region": "Masovian Voivodeship"
               }
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "2",
            "_score": 0.3533006,
            "_source": {
               "date": "2013-12-30T00:00:00.000Z",
               "value": 1,
               "dimensions": {
                  "region": "Federal District"
               }
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "1",
            "_score": 0.05937162,
            "_source": {
               "date": "2013-12-30T00:00:00.000Z",
               "value": 2,
               "dimensions": {
                  "region": "Coimbra District"
               }
            }
         }
      ]
   }
}
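All three documents come back because `query_string` uses OR as its default operator: the unquoted words in the query are treated as independent optional terms, so any document containing at least one of them matches. A minimal sketch of that behaviour in Python, assuming the simplified tokenizer below stands in for the standard analyzer (the helper names are made up for this illustration):

```python
import re

DOCS = {
    "1": "Coimbra District",
    "2": "Federal District",
    "3": "Masovian Voivodeship",
}


def tokens(text):
    # Simplified stand-in for the standard analyzer (assumption).
    return {t for t in re.split(r"[^0-9a-z]+", text.lower()) if t}


def match_any_term(query, docs):
    """Return ids of docs sharing at least one term with the query (OR semantics)."""
    query_terms = tokens(query) - {"or"}  # drop the boolean operator itself
    return sorted(
        doc_id for doc_id, region in docs.items() if tokens(region) & query_terms
    )


print(match_any_term("Masovian Voivodeship OR Federal District", DOCS))
# prints ['1', '2', '3'] because document 1 shares the term "district"
```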

However, that query returns a document you don't want ("Coimbra District"). One way to fix this is as follows:

POST /test_index/doc/_search
{
   "query": {
      "query_string": {
         "query": "(Masovian AND Voivodeship) OR (Federal AND District)",
         "fields": [
            "dimensions.region"
         ]
      }
   }
}
...
{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0.46911472,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "3",
            "_score": 0.46911472,
            "_source": {
               "date": "2013-12-30T00:00:00.000Z",
               "value": 1,
               "dimensions": {
                  "region": "Masovian Voivodeship"
               }
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "2",
            "_score": 0.3533006,
            "_source": {
               "date": "2013-12-30T00:00:00.000Z",
               "value": 1,
               "dimensions": {
                  "region": "Federal District"
               }
            }
         }
      ]
   }
}
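Grouping the words with AND means every term of a group has to be present in the field, which is why "Coimbra District" drops out. A sketch of that logic in Python, assuming a simplified tokenizer in place of the standard analyzer (all names are hypothetical):

```python
import re

DOCS = {
    "1": "Coimbra District",
    "2": "Federal District",
    "3": "Masovian Voivodeship",
}


def tokens(text):
    # Simplified stand-in for the standard analyzer (assumption).
    return {t for t in re.split(r"[^0-9a-z]+", text.lower()) if t}


def match_all_terms_of_any_phrase(phrases, docs):
    """Ids of docs containing every term of at least one phrase: (a AND b) OR (c AND d)."""
    groups = [group for group in (tokens(p) for p in phrases) if group]
    return sorted(
        doc_id
        for doc_id, region in docs.items()
        if any(group <= tokens(region) for group in groups)
    )


print(match_all_terms_of_any_phrase(["Masovian Voivodeship", "Federal District"], DOCS))
# prints ['2', '3']
```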

Another way (I like this one better), which gives the same results, is to use a combination of match queries and a bool should:

POST /test_index/doc/_search
{
   "query": {
      "bool": {
         "should": [
            {
               "match": {
                  "dimensions.region": {
                     "query": "Masovian Voivodeship",
                     "operator": "and"
                  }
               }
            },
            {
               "match": {
                  "dimensions.region": {
                     "query": "Federal District",
                     "operator": "and"
                  }
               }
            }
         ]
      }
   }
}

Here is the code I used:

http://sense.qbox.io/gist/bb5062a635c4f9519a411fdd3c8540eae8bdfd51
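When the list of regions is not fixed, the `bool`/`should` request body with one `match` clause per region can be generated programmatically. The sketch below builds that structure as a plain Python dict; `build_region_query` and its parameters are names invented for this example, and sending the body to Elasticsearch is left out.

```python
import json


def build_region_query(regions, field="dimensions.region"):
    """Build a bool/should query with one AND-operator match clause per region."""
    should = [
        {"match": {field: {"query": region, "operator": "and"}}}
        for region in regions
    ]
    return {"query": {"bool": {"should": should}}}


body = build_region_query(["Masovian Voivodeship", "Federal District"])
print(json.dumps(body, indent=3))
```

Note that a `match` clause with `"operator": "and"` requires all terms to be present but still compares analyzed terms, so a value such as "Federal District North" would also match. It is not a strict exact-value comparison.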
