Multi-term query on nested documents with ElasticSearch

Question

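I'm new to Elasticsearch and can't understand why it behaves the way it does. I have indexed the following document structure (I'm using Chewy with Rails, but it should make sense either way):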

class OpportunityLocationsIndex < Chewy::Index
  define_type OpportunityLocation.includes(:opportunity).joins(:opportunity => :company).where(:opportunities => {is_valid: true}) do
    field :location
    field :coordinates, type: 'geo_point'
    field :opening_status

    field :opportunity, type: 'object' do
      field :name, :summary
      field :opportunity_count, value: ->(o) { o.total_positions }

      field :company, type: 'object' do
        field :name
        field :slug
        field :industry

        field :company_path, value: ->(c) { "/companies/" + c.slug }
        field :logo_image, value: ->(c) { c.logo_image.url(:medium) }
        field :logo_image_grey, value: ->(c) { c.logo_image.url(:greyscale) }
      end
    end
  end
end

Now, say I want to fetch all documents whose location is "Johannesburg, Gauteng, South Africa". I would run the following query:

GET _search
{
    "query": {
        "match": {
           "location": "Johannesburg, Gauteng, South Africa"
        }
    }
}

This returns the following:

  {
     "took": 7,
     "timed_out": false,
     "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
     },
     "hits": {
        "total": 13,
        "max_score": 1.6014341,
        "hits": [
           {
              "_index": "opportunity_locations",
              "_type": "opportunity_location",
              "_id": "56",
              "_score": 1.6014341,
              "_source": {
                 "location": "Johannesburg, Gauteng, South Africa",
                 "coordinates": "28.0473051, -26.2041028",
                 "opening_status": "closed",
                 "opportunity": {
                    "name": "Bentley Test Opportunity",
                    "summary": "Engineering at Bentley provides some unique and interesting challenges. The Interior Systems engineers...",
                    "opportunity_count": 6,
                    "company": {
                       "name": "Bentley Motors",
                       "slug": "bentley-motors",
                       "industry": "Automobile / Mechanical Engineering",
                       "company_path": "/companies/bentley-motors",
                       "logo_image": "/public/system/companies/logo_images/000/000/008/medium/bentley_logo_desktop_wallpaper-normal.jpg?1397906812",
                       "logo_image_grey": "/public/system/companies/logo_images/000/000/008/greyscale/bentley_logo_desktop_wallpaper-normal.jpg?1397906812"
                    }
                 }
              }
           },
           { etc. }
        ]
     }
  }

Right, that works and makes sense. Now, if I want to fetch all documents whose company name is "Bentley Motors" or "BMW", I try the following:

GET _search
{
    "query": {
        "terms": {
           "opportunity.company.name": [
              "Bentley Motors",
              "BMW"
           ]
        }
    }
}

This returns zero results. What am I doing wrong?

Answer 1

It has to do with how your data is indexed and how you query it.

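Your first request uses a match query, which is smart enough to determine, based on how the field is mapped, whether the query text has to be analyzed.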

Your second request uses a terms query, which applies no analyzer and looks up exactly the given values in the inverted index.

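For example, if you index the string TEST with the default mapping:

A term query for TEST returns no results, because the term stored in the index is the lowercased test.
A match query for TEST returns your document, because the query text is analyzed the same way as at index time.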

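In your case, when you indexed your documents, the field value was analyzed with the standard analyzer, which turned the value Bentley Motors into two separate terms: bentley and motors.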
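To make the difference concrete, here is a minimal Ruby sketch of a toy inverted index. It is not Elasticsearch code: `toy_analyze`, `build_index`, `terms_lookup` and `match_lookup` are hypothetical helpers, the tokenizer is only a rough approximation of the standard analyzer, and document id 57 is invented for illustration (56 is the id from the response above).

```ruby
require 'set'

# Toy stand-in for the standard analyzer: lowercase the text and split it
# on non-alphanumeric characters. The real analyzer does more than this.
def toy_analyze(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

# Build a tiny inverted index: term => set of document ids.
def build_index(docs)
  index = Hash.new { |hash, term| hash[term] = Set.new }
  docs.each do |id, text|
    toy_analyze(text).each { |term| index[term] << id }
  end
  index
end

# terms-style lookup: the given values are used verbatim, with no analysis.
def terms_lookup(index, values)
  values.reduce(Set.new) { |hits, value| hits | index.fetch(value, Set.new) }
end

# match-style lookup: the input is analyzed first, then each token is looked up.
def match_lookup(index, text)
  terms_lookup(index, toy_analyze(text))
end

# Document 57 is a made-up second company used only for this sketch.
docs = { 56 => 'Bentley Motors', 57 => 'BMW' }
index = build_index(docs)

p index.keys                                           # => ["bentley", "motors", "bmw"]
p terms_lookup(index, ['Bentley Motors', 'BMW']).to_a  # => []
p terms_lookup(index, ['bentley', 'bmw']).to_a         # => [56, 57]
p match_lookup(index, 'Bentley Motors').to_a           # => [56]
```

The exact values "Bentley Motors" and "BMW" never appear as keys in the index, which is why the verbatim lookup comes back empty while the analyzed lookup finds the document.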

You can check this by using just bentley or motors in your terms query: you will find your document.

Then try changing your second request to use a match query on Bentley Motors: you should retrieve your document as well.
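For the original "Bentley Motors or BMW" case, one possible shape for an analyzed query is a bool query with one should clause per name. The sketch below only builds the request body as a Ruby hash and prints it as JSON. `company_name_query` is a hypothetical helper, and using match_phrase rather than plain match is my own choice, not something the answer prescribes: a plain match on Bentley Motors would also accept documents that contain only bentley or only motors.

```ruby
require 'json'

# Build a bool/should query with one match_phrase clause per company name.
# With only should clauses present, a document has to satisfy at least one.
def company_name_query(names)
  {
    query: {
      bool: {
        should: names.map do |name|
          { match_phrase: { 'opportunity.company.name' => name } }
        end
      }
    }
  }
end

# Print the body that would be sent with GET _search.
puts JSON.pretty_generate(company_name_query(['Bentley Motors', 'BMW']))
```

The printed JSON can be pasted as the body of the GET _search request from the question.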

If you want to use a terms query for the second request, you must set the mapping of the company name field to not_analyzed.
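As a rough sketch, the mapping fragment for such a field might look like the hash below, written in the string-type syntax of the Elasticsearch versions current when this answer was posted (later versions replaced not_analyzed strings with a keyword type). The constant name is hypothetical, only the company name field is shown, and the type name opportunity_location is taken from the `_type` in the response above.

```ruby
require 'json'

# Mapping fragment that stores opportunity.company.name as a single,
# unanalyzed term, so "Bentley Motors" is indexed exactly as written.
NOT_ANALYZED_NAME_MAPPING = {
  opportunity_location: {
    properties: {
      opportunity: {
        properties: {
          company: {
            properties: {
              name: { type: 'string', index: 'not_analyzed' }
            }
          }
        }
      }
    }
  }
}.freeze

puts JSON.pretty_generate(NOT_ANALYZED_NAME_MAPPING)

# Assumption: in the Chewy index from the question, extra field options are
# passed through to the mapping, so the equivalent would presumably be
#   field :name, index: 'not_analyzed'
```

Documents that are already indexed keep their analyzed terms, so the index has to be rebuilt after the mapping change before the terms query from the question will match.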

Answer 1
3 Accepted 2014-08-20 14:25:45
