I'm new to Elasticsearch, and I'm having trouble understanding why it does certain things. I have the following document structure indexed (I'm using Chewy in Rails, but it should make sense either way):
class OpportunityLocationsIndex < Chewy::Index
  define_type OpportunityLocation.includes(:opportunity).joins(:opportunity => :company).where(:opportunities => {is_valid: true}) do
    field :location
    field :coordinates, type: 'geo_point'
    field :opening_status
    field :opportunity, type: 'object' do
      field :name, :summary
      field :opportunity_count, value: ->(o) { o.total_positions }
      field :company, type: 'object' do
        field :name
        field :slug
        field :industry
        field :company_path, value: ->(c) { "/companies/" + c.slug }
        field :logo_image, value: ->(c) { c.logo_image.url(:medium) }
        field :logo_image_grey, value: ->(c) { c.logo_image.url(:greyscale) }
      end
    end
  end
end
Now, say I want to get all documents with a location of "Johannesburg, Gauteng, South Africa". I would run the following query:
GET _search
{
  "query": {
    "match": {
      "location": "Johannesburg, Gauteng, South Africa"
    }
  }
}
Which would spit out the following.
{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 13,
    "max_score": 1.6014341,
    "hits": [
      {
        "_index": "opportunity_locations",
        "_type": "opportunity_location",
        "_id": "56",
        "_score": 1.6014341,
        "_source": {
          "location": "Johannesburg, Gauteng, South Africa",
          "coordinates": "28.0473051, -26.2041028",
          "opening_status": "closed",
          "opportunity": {
            "name": "Bentley Test Opportunity",
            "summary": "Engineering at Bentley provides some unique and interesting challenges. The Interior Systems engineers...",
            "opportunity_count": 6,
            "company": {
              "name": "Bentley Motors",
              "slug": "bentley-motors",
              "industry": "Automobile / Mechanical Engineering",
              "company_path": "/companies/bentley-motors",
              "logo_image": "/public/system/companies/logo_images/000/000/008/medium/bentley_logo_desktop_wallpaper-normal.jpg?1397906812",
              "logo_image_grey": "/public/system/companies/logo_images/000/000/008/greyscale/bentley_logo_desktop_wallpaper-normal.jpg?1397906812"
            }
          }
        }
      },
      { etc. }
    ]
  }
}
Right, so that works, and it makes sense that it works. Now, what if I want to get all documents that have a company name of "Bentley Motors" or "BMW"? I tried the following:
GET _search
{
  "query": {
    "terms": {
      "opportunity.company.name": [
        "Bentley Motors",
        "BMW"
      ]
    }
  }
}
Which returns zero results. What am I doing wrong?
It's related to how your data is indexed and how you then query it.
Your first request uses a match query, which analyzes the search text with the analyzer configured for that field in your mapping (and skips analysis if the field is not analyzed).
Your second request uses a terms query, which does not analyze its input and looks up exactly the values you give it in the inverted index.
For example, if you index the string TEST with the default mapping, it is stored in the index as the lowercased term test. A term query for TEST will return no results, while a match query for TEST will return your document, because it analyzes the text the same way as at index time. In your case, when you indexed your document, this field's value was analyzed by the standard analyzer, which turned Bentley Motors into two separate terms, bentley and motors.
You can check this by using only bentley or motors in your terms query: you will find your document.
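To make the difference concrete, here is a toy sketch in plain Ruby (no Elasticsearch involved). The `analyze` method is only a rough stand-in for the standard analyzer (it lowercases and splits on non-alphanumeric characters), and `build_index`, `terms_lookup` and `match_lookup` are hypothetical helper names invented for this illustration, not Elasticsearch or Chewy APIs.

```ruby
# Rough stand-in for the standard analyzer: lowercase the text and split it
# on anything that is not an ASCII letter or digit. The real analyzer does
# more than this; the sketch only shows the effect on "Bentley Motors".
def analyze(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Toy inverted index: analyzed term => ids of the documents containing it.
def build_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    analyze(text).each { |term| index[term] << id }
  end
  index
end

# terms-style lookup: each value is looked up verbatim, with no analysis.
def terms_lookup(index, values)
  values.flat_map { |value| index.fetch(value, []) }.uniq
end

# match-style lookup: the input is analyzed first, then each token is looked up.
def match_lookup(index, text)
  analyze(text).flat_map { |term| index.fetch(term, []) }.uniq
end

docs  = { 56 => "Bentley Motors", 57 => "BMW" }
index = build_index(docs)

p index.keys                                      # ["bentley", "motors", "bmw"]
p terms_lookup(index, ["Bentley Motors", "BMW"])  # []
p terms_lookup(index, ["bentley", "bmw"])         # [56, 57]
p match_lookup(index, "Bentley Motors")           # [56]
```

The index never contains the key "Bentley Motors", only the lowercased tokens, which is why the unanalyzed lookup comes back empty.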
Then try changing your second request to a match query with Bentley Motors: you should retrieve your document too.
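If you need both company names in one request while the field stays analyzed, one option is a bool query with one match_phrase clause per name. This is a sketch, not the only way to do it: `company_name_query` is a hypothetical helper, and the field path is taken from your terms query. I use match_phrase rather than match because a plain match on Bentley Motors ORs the tokens by default, so it would also return any other company with "motors" in its name.

```ruby
require "json"

# Build one match_phrase clause per company name and OR them with bool/should.
# The field name defaults to the path used in the question.
def company_name_query(names, field = "opportunity.company.name")
  clauses = names.map { |name| { match_phrase: { field => name } } }
  { query: { bool: { should: clauses } } }
end

# Print the JSON body that would be sent with GET _search.
puts JSON.pretty_generate(company_name_query(["Bentley Motors", "BMW"]))
```

The printed JSON can be pasted as the body of a GET _search request in place of the terms query.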
If you want to use a terms query for your second request, you must set the mapping of your company name field to not_analyzed.
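As a rough sketch of what that mapping could look like, the Ruby below builds the relevant fragment as a hash and prints it as JSON. It assumes the string/not_analyzed syntax of Elasticsearch 1.x and 2.x (from 5.0 on, the equivalent is a field of type keyword), and the type and field names are taken from the response in your question. `company_name_mapping` is a hypothetical helper name, and the Chewy line in the closing comment is my assumption about how the option would be passed. The analysis setting of an existing field cannot be changed in place, so the index has to be rebuilt after changing the mapping.

```ruby
require "json"

# Mapping fragment that stores opportunity.company.name as a single,
# unanalyzed term, so a terms query for "Bentley Motors" can match it.
def company_name_mapping
  {
    opportunity_location: {
      properties: {
        opportunity: {
          properties: {
            company: {
              properties: {
                name: { type: "string", index: "not_analyzed" }
              }
            }
          }
        }
      }
    }
  }
end

puts JSON.pretty_generate(company_name_mapping)

# Assumed Chewy equivalent inside the company block of the index definition:
#   field :name, index: 'not_analyzed'
```

Keep in mind that with this mapping the value is matched exactly and case-sensitively, so a terms query for bentley would no longer find the document.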