Mongoose MongoDB GeoSpatial Query

I have an Item collection which could hold thousands to hundreds of thousands of documents. On that collection, I want to perform Geospatial queries. Using Mongoose, there are two options - find() and the Aggregation Pipeline. I have displayed my implementations of both below:

Mongoose Model

To start, here are the relevant properties of my Mongoose Model:

// Define the schema
const itemSchema = new mongoose.Schema({
    // Firebase UID (in addition to the Mongo ObjectID)
    owner: {
        type: String,
        required: true,
        ref: 'User'
    },
    // ... Some more fields
    numberOfViews: {
        type: Number,
        required: true,
        default: 0
    },
    numberOfLikes: {
        type: Number,
        required: true, 
        default: 0
    },
    location: {
        type: {
            type: String,
            enum: ['Point'], // only GeoJSON Points are stored here
            default: 'Point',
            required: true
        },
        coordinates: {
            type: [Number],
            required: true,
        },
    }
}, {
    timestamps: true
});

// 2dsphere index
itemSchema.index({ "location": "2dsphere" });

// Create the model
const Item = mongoose.model('Item', itemSchema);

Find Query

// These variables are populated based on URL Query Parameters.
const match = {};
const sort = {};

// Query to make.
const query = {
    location: {
        $near: {
            $maxDistance: parseInt(req.query.maxDistance),
            $geometry: {
                type: 'Point',
                // parseFloat, not parseInt: coordinates are decimal degrees, in [lng, lat] order
                coordinates: [parseFloat(req.query.lng), parseFloat(req.query.lat)]
            }
        }
    },
    ...match
};

// Pagination and Sorting
const options = {
    limit: parseInt(req.query.limit),
    skip: parseInt(req.query.skip),
    sort
};

const items = await Item.find(query, undefined, options).lean().exec();

res.send(items);
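
The find query above reads its numbers straight from req.query. As a minimal sketch, the parsing and range checks could be pulled into small helpers so that invalid coordinates never reach MongoDB. The names parseGeoParams and buildNearFilter are hypothetical and not part of Mongoose. The sketch assumes maxDistance is given in meters, which is the unit $near uses with a GeoJSON point.

```javascript
// Hypothetical helper: parse and validate the geo-related URL query parameters.
function parseGeoParams(q) {
    const lng = Number.parseFloat(q.lng);
    const lat = Number.parseFloat(q.lat);
    const maxDistance = Number.parseFloat(q.maxDistance);

    if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
        throw new RangeError('lng must be a number between -180 and 180');
    }
    if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
        throw new RangeError('lat must be a number between -90 and 90');
    }
    if (!Number.isFinite(maxDistance) || maxDistance < 0) {
        throw new RangeError('maxDistance must be a non-negative number of meters');
    }
    return { lng, lat, maxDistance };
}

// Hypothetical helper: build the $near filter, merging in any extra match conditions.
function buildNearFilter({ lng, lat, maxDistance }, match = {}) {
    return {
        location: {
            $near: {
                $maxDistance: maxDistance,
                // GeoJSON order is [longitude, latitude]
                $geometry: { type: 'Point', coordinates: [lng, lat] }
            }
        },
        ...match
    };
}
```

The result of buildNearFilter(parseGeoParams(req.query), match) could then be passed to Item.find() in place of the inline query object.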

Aggregation Pipeline

Suppose distance needed to be calculated:

// These variables are populated based on URL Query Parameters.
const query = {};
const sort = {};

const geoSpatialQuery = {
    $geoNear: {
        near: { 
            type: 'Point', 
            // parseFloat, not parseInt: coordinates are decimal degrees, in [lng, lat] order
            coordinates: [parseFloat(req.query.lng), parseFloat(req.query.lat)]
        },
        distanceField: "distance",
        maxDistance: parseInt(req.query.maxDistance),
        query,
        spherical: true
    }
};

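const items = await Item.aggregate([
    geoSpatialQuery,
    // Sort before paginating; otherwise $skip and $limit act on an arbitrary subset.
    // Note: $geoNear already returns nearest-first, so distance: -1 reverses that order.
    { $sort: { distance: -1, ...sort } },
    // Skip before limit, so each page holds `limit` documents.
    { $skip: parseInt(req.query.skip) },
    { $limit: parseInt(req.query.limit) }
]).exec();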

res.send(items);
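
To make the meaning of the distance field more concrete, here is a rough sketch of the great-circle (haversine) distance between two [lng, lat] pairs, in meters. It is only a sanity check for the values $geoNear writes into distanceField, not MongoDB's own implementation. The function name and the Earth radius of 6378.1 km are assumptions, so small differences from the server's numbers are to be expected.

```javascript
// Assumed Earth radius in meters (6378.1 km); results are approximate.
const EARTH_RADIUS_METERS = 6378100;

// Hypothetical helper: haversine distance between two [lng, lat] points, in meters.
function haversineMeters([lng1, lat1], [lng2, lat2]) {
    const toRad = (deg) => (deg * Math.PI) / 180;
    const dLat = toRad(lat2 - lat1);
    const dLng = toRad(lng2 - lng1);
    const a = Math.sin(dLat / 2) ** 2 +
        Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
    return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}
```

Comparing haversineMeters([lng, lat], item.location.coordinates) with item.distance for a few returned documents is a quick way to confirm that the query point and the units are what you expect.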

Edit - Example Document Amended

Here is an example of a document with all of its properties from the Item collection:

{
   "_id":"5cd08927c19d1dd118d39a2b",
   "imagePaths":{
      "standard":{
         "images":[
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-aafe69c7-f93e-411e-b75d-319042068921-standard.jpg",
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-397c95c6-fb10-4005-b511-692f991341fb-standard.jpg",
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-e54db72e-7613-433d-8d9b-8d2347440204-standard.jpg",
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-c767f54f-7d1e-4737-b0e7-c02ee5d8f1cf-standard.jpg"
         ],
         "profile":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-standard-profile.jpg"
      },
      "thumbnail":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-thumbnail.jpg",
      "medium":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-medium.jpg"
   },
   "location":{
      "type":"Point",
      "coordinates":[
         -110.8571443,
         35.4586858
      ]
   },
   "numberOfViews":0,
   "numberOfLikes":0,
   "monetarySellingAmount":9000,
   "exchangeCategories":[
      "Math"
   ],
   "itemCategories":[
      "Sports"
   ],
   "title":"My title",
   "itemDescription":"A description",
   "exchangeRadius":10,
   "owner":"zbYmcwsGhcU3LwROLWa4eC0RRgG3",
   "reports":[],
   "createdAt":"2019-05-06T19:21:13.217Z",
   "updatedAt":"2019-05-06T19:21:13.217Z",
   "__v":0
}

Questions

Based on the above, I wanted to ask a few questions.

  1. Is there a performance difference between my implementations of the normal Mongoose Query and the use of the Aggregation Pipeline?

  2. Is it correct to say that near and geoNear are pretty much similar to nearSphere when using the 2dsphere index with GeoJSON, except that geoNear provides extra data and default limiting? That is, despite having different units, both queries would conceptually return relevant data within a specific radius of some location, even though the field is called radius for nearSphere and maxDistance for near / geoNear.

  3. With my example above, how might the performance loss of using skip be mitigated but still be able to achieve pagination in both querying and aggregation?

  4. The find() function allows an optional parameter to determine which fields will be returned. The Aggregation Pipeline takes a $project stage to do the same. Is there a specific order where $project should be used in the pipeline to optimize speed/efficiency, or does it not matter?

I hope this style of question is permitted as per the Stack Overflow rules. Thank you.

I tried the query below with a 2dsphere index, using the aggregation pipeline.

db.items.createIndex({location:"2dsphere"})

The aggregation pipeline gives you more flexibility over the result set. It may also improve performance for geo-related searches, although that depends on the query and the data.

db.items.aggregate([
    {
        // $geoNear must be the first stage of the pipeline
        $geoNear: {
            near: { type: "Point", coordinates: [ -110.8571443, 35.4586858 ] },
            key: "location",
            distanceField: "dist.calculated",
            minDistance: 2,
            query: { "itemDescription": "A description" },
            spherical: true
        }
    }
])

On your question about $skip, the following question gives more insight into the $skip operation: $skip and $limit in aggregation framework
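
One possible way to avoid large $skip values is to page by distance instead: remember the distance of the last document on the previous page and pass it as minDistance on the next request. This is only a sketch of that idea, not something taken from the linked question. The names nextCursor and buildNextPagePipeline are hypothetical, and the sketch assumes results are consumed in the nearest-first order that $geoNear returns.

```javascript
// Hypothetical helper: derive a cursor from the last page of results.
// Documents tied at the last distance are remembered so they are not returned twice.
function nextCursor(page) {
    if (page.length === 0) {
        return { lastDistance: 0, seenIds: [] };
    }
    const lastDistance = page[page.length - 1].distance;
    const seenIds = page
        .filter((doc) => doc.distance === lastDistance)
        .map((doc) => doc._id);
    return { lastDistance, seenIds };
}

// Hypothetical helper: build the pipeline for the next page without $skip.
function buildNextPagePipeline({ lng, lat, maxDistance, limit, lastDistance = 0, seenIds = [] }) {
    return [
        {
            $geoNear: {
                near: { type: 'Point', coordinates: [lng, lat] },
                distanceField: 'distance',
                // Start where the previous page ended instead of skipping documents.
                minDistance: lastDistance,
                maxDistance,
                // Caveat: aggregate() does not cast values, so with Mongoose
                // these ids would need to be ObjectId instances, not strings.
                query: seenIds.length > 0 ? { _id: { $nin: seenIds } } : {},
                spherical: true
            }
        },
        { $limit: limit }
    ];
}
```

The client would send lastDistance and seenIds back with each request instead of skip. If many documents share exactly the same distance across more than one page, this simple cursor is not enough, and a stricter tie-breaking scheme would be needed.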

You can use $project according to your needs. In our case, we did not have much of a performance issue using $project on a collection of 10 million documents.
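
As an illustration of where $project could go, the sketch below appends it as the last stage, after the geo search and pagination, so that it only shapes the documents that are actually returned. The helper name withProjection and the field list are hypothetical. As far as I know, MongoDB's pipeline optimizer can often work out which fields later stages need, so an early $project is usually not required, but treat that as an assumption to verify with explain() on your own data.

```javascript
// Hypothetical helper: append a $project stage that keeps only the listed fields.
// The input pipeline is not modified; a new array is returned.
function withProjection(pipeline, fields) {
    const projection = {};
    for (const field of fields) {
        projection[field] = 1;
    }
    return [...pipeline, { $project: projection }];
}

// Example: $geoNear first, pagination next, $project last.
const basePipeline = [
    {
        $geoNear: {
            near: { type: 'Point', coordinates: [-110.8571443, 35.4586858] },
            distanceField: 'distance',
            maxDistance: 5000, // meters; illustrative value
            spherical: true
        }
    },
    { $skip: 0 },
    { $limit: 20 }
];

const projectedPipeline = withProjection(basePipeline, ['title', 'location', 'distance']);
```

projectedPipeline could then be passed to Item.aggregate() or db.items.aggregate().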
