
How can I query a MongoDB collection by both a geo spatial index and a text index quickly?

Given a collection, locations, of roughly 20,000,000 documents, each with three properties:

{
    _id,
    name, // string
    geo // coordinate pair, e.g. [-90.123456, 30.123456]
}

with an ascending index on name ({name: 1}) and a geo index set up like so:

{ 
    "geo" : "2dsphere"
},
{ 
    "v" : 1, 
    "name" : "geo_2dsphere", 
    "ns" : "db.locations", 
    "min" : "-180.0", 
    "max" : "180.0", 
    "w" : 1.0, 
    "2dsphereIndexVersion" : 2
}

How can I efficiently query this collection using both the geo_2dsphere index and the name index?

When I run a $box query on the geo index only, it takes over 20 seconds to return 50 results. When I include a search against the name property it goes up even further.
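One thing that may be worth checking for the slow $box case: as far as I know, $box is a flat (legacy coordinate) shape that is served by 2d indexes, so with only a 2dsphere index the query may not be using the index at all. Below is a minimal sketch that expresses the same area as a GeoJSON Polygon for $geoWithin with $geometry, which a 2dsphere index can serve. The helper names and the sample corner coordinates are hypothetical, and the polygon's edges are geodesics rather than straight lines on a flat map, so it only approximates the box.

```javascript
// Convert a bounding box into a closed GeoJSON Polygon.
// sw = [minLon, minLat], ne = [maxLon, maxLat] (hypothetical inputs).
function boxToPolygon(sw, ne) {
  const [minLon, minLat] = sw;
  const [maxLon, maxLat] = ne;
  return {
    type: "Polygon",
    // Exterior ring; the first and last positions must be identical.
    coordinates: [[
      [minLon, minLat],
      [maxLon, minLat],
      [maxLon, maxLat],
      [minLon, maxLat],
      [minLon, minLat],
    ]],
  };
}

// Build a $geoWithin filter on the "geo" field from the question.
function withinBoxQuery(sw, ne) {
  return { geo: { $geoWithin: { $geometry: boxToPolygon(sw, ne) } } };
}

// Sample corners near the point used in the question (made up for illustration).
const boxQuery = withinBoxQuery([-90.2, 30.0], [-90.0, 30.2]);

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.locations.find(boxQuery).limit(50).explain("executionStats"));
}
```

If the explain output shows an index scan on geo_2dsphere instead of a collection scan, the $box form was probably the problem.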

If I run a $near query, things can be very fast, but some queries jump unpredictably from ~200ms to many seconds. In this example the only difference is one additional character in the name prefix, which actually increases the time:

200ms:

{name: /^mac/, geo: {$near: {$geometry: {type: "Point", coordinates: [ -90.123456, 30.123456 ]}, $maxDistance: 20000}}}

18,000ms:

{name: /^macy/, geo: {$near: {$geometry: {type: "Point", coordinates: [ -90.123456, 30.123456 ]}, $maxDistance: 20000}}}

I can't understand why a more specific name prefix slows things down so much. With a longer prefix, I have to reduce $maxDistance drastically, to something like 7,000 meters, before the query returns in a reasonable amount of time.
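A likely explanation (not confirmed against this data set) is that $near has to use the geo index, walks outward from the point in distance order, and checks name on each candidate. A rarer prefix such as "macy" matches fewer nearby documents, so far more candidates get examined before enough results are found or the 20 km radius is exhausted. A minimal sketch for comparing the two queries with explain() follows; the helper names and the limit of 50 are assumptions.

```javascript
// Escape regex metacharacters so the prefix is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the $near + anchored-prefix filter used in the question.
function nearByPrefix(prefix, lon, lat, maxDistanceMeters) {
  return {
    name: new RegExp("^" + escapeRegex(prefix)),
    geo: {
      $near: {
        $geometry: { type: "Point", coordinates: [lon, lat] },
        $maxDistance: maxDistanceMeters,
      },
    },
  };
}

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  ["mac", "macy"].forEach(function (prefix) {
    const filter = nearByPrefix(prefix, -90.123456, 30.123456, 20000);
    const stats = db.locations.find(filter).limit(50)
      .explain("executionStats").executionStats;
    // Compare how much work each query did for the results it returned.
    print(prefix, "returned:", stats.nReturned,
      "keys:", stats.totalKeysExamined,
      "docs:", stats.totalDocsExamined,
      "ms:", stats.executionTimeMillis);
  });
}
```

If totalDocsExamined is much larger for "macy" than for "mac", the extra time is going into scanning geo candidates that fail the name filter.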

Is there a better setup I should be using here?

As Blakes Seven pointed out to me, a single MongoDB query cannot use more than one "special" index:

There is a "highlander rule" (there can be only one) in the query evaluation that denies the usage of more than "one" "special" index in a query evaluation. So you cannot have multiple "text" or multiple "geospatial" or any combination of "text" and "geospatial" or usage of any of those within an $or condition, that results in multiple index selection.
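The rule above is about combining separate special indexes. A single compound 2dsphere index that also covers name is a different thing, and MongoDB does allow non-geo fields in a 2dsphere index. I did not test this on the 20,000,000-document collection, so treat the following as a sketch of an option rather than a verified fix; the field order is an assumption.

```javascript
// One index holding both fields, so name can be checked from index keys
// without fetching each document found by the geo scan.
const compoundIndexSpec = { geo: "2dsphere", name: 1 };

// Only runs inside the mongo shell, where `db` is defined.
// Building an index on a collection this large can take a long time.
if (typeof db !== "undefined") {
  db.locations.createIndex(compoundIndexSpec);
  printjson(db.locations.getIndexes());
}
```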

So, I've opted to move over to Elasticsearch for this specific query, indexing only what I need to complete these multi-index queries, and then use those results to load the necessary Mongo documents. Works quickly, works well.
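Roughly, the two-step flow looks like the sketch below. This is not my production code: the helper names, the result size, and the assumption that Elasticsearch has name indexed so that a prefix query works and geo mapped as a geo_point are all illustrative. The Elasticsearch and MongoDB client calls are left out so the snippet runs on its own.

```javascript
// Step 1: body for an Elasticsearch search that returns only matching ids.
function buildEsQuery(prefix, lon, lat, distanceMeters, size) {
  return {
    size: size,
    _source: false, // only the hit ids are needed
    query: {
      bool: {
        must: [{ prefix: { name: prefix } }],
        filter: [{
          geo_distance: {
            distance: distanceMeters + "m",
            geo: { lat: lat, lon: lon }, // assumes "geo" is a geo_point field
          },
        }],
      },
    },
  };
}

// Step 2: MongoDB filter that loads the full documents by _id.
function mongoFilterForIds(ids) {
  return { _id: { $in: ids } };
}

// $in does not preserve order, so restore the order of the search hits.
function orderByIds(ids, docs) {
  const byId = new Map(docs.map((doc) => [String(doc._id), doc]));
  return ids
    .map((id) => byId.get(String(id)))
    .filter((doc) => doc !== undefined);
}

// Example with made-up ids and documents standing in for real results.
const hitIds = ["b", "a"];
const loadedDocs = [{ _id: "a", name: "Macy's A" }, { _id: "b", name: "Macy's B" }];
const orderedDocs = orderByIds(hitIds, loadedDocs);
```

The ids from the search hits go into mongoFilterForIds, and orderByIds puts the loaded documents back into relevance order.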
