简体   繁体   中英

MongoDB query based on multiple joins

There are many questions about joins and MongoDB but a lot of them have outdated answers that do not take features after Mongo 3.x into consideration. My question is how would you query a table with conditions on linked elements?

Here is an extremely simplified example

const Person =  new mongoose.Schema({
  gender: String
});

const Dog =  new mongoose.Schema({
  breed: String
});

const Team =  new mongoose.Schema({
  trainer: {
        type: mongoose.Schema.ObjectId,
        ref: 'Person'
  },
  members: [{
        type: mongoose.Schema.ObjectId,
        ref: 'Dog'
  }]
})

Imagine this is already in production and changing the schema is not a possibility.

How could I retrieve all teams with at least one member dog of the "Poodle" breed AND where the trainer's gender is "male" ?

Putting your question in a different way: How to join more than two collections in mongoDB?

Assuming your collection names against model names are dogs , teams and people (Mongoose convention of pluralizing), following is one of the ways to achieve the desired result:

Dog.aggregate([{
        $match: {
            breed: "Poodle"
        }
    },
    {
        $lookup: {
            from: "teams",
            localField: "_id",
            foreignField: "members",
            as: "team"
        }
    },
    {
        $unwind: "$team"
    },
    {
        $lookup: {
            from: "people",
            localField: "team.trainer",
            foreignField: "_id",
            as: "trainer"
        }
    },

    {
        $unwind: "$trainer"
    },

    {
        $match: {
            "trainer.gender": "male"
        }
    },
    {
        $project: {
            breed: 1,
            trainer: 1,
            team: {
                _id: 1
            }
        }
    }
], function(err, teams) {
   console.log(teams)
});

In the aggregation pipeline we do the following:

  1. Take Dog as a starting point and match the breed
  2. Then use $lookup to join the results with teams and fetch those teams that contain member reference to "Poodle"
  3. The result set from 2 contains array of team (you can remove all the steps below the $lookup to see the state of results). To split this array into another document we use $unwind operator (team of say three elements will become three documents with parent fields replicated in all)
  4. On the new result set, apply $lookup again, this time joining people. This puts people in trainer array.
  5. Again unwind to split trainer
  6. Match the result set for trainer.gender "male"
  7. $project (select) fields that you need

The final result will look something like this:

{
    "_id" : ObjectId("596e5500b5174986059958a8"),
    "breed" : "Poodle",
    "team" : {
        "_id" : ObjectId("596e564fb5174986059958de")
    },
    "trainer" : {
        "_id" : ObjectId("596e54bfb51749860599589c"),
        "gender" : "male"
    }
}


{
    "_id" : ObjectId("596e5500b5174986059958a8"),
    "breed" : "Poodle",
    "team" : {
        "_id" : ObjectId("596e564fb5174986059958e6")
    },
    "trainer" : {
        "_id" : ObjectId("596e54bfb51749860599589c"),
        "gender" : "male"
    }
}


{
    "_id" : ObjectId("596e5500b5174986059958b2"),
    "breed" : "Poodle",
    "team" : {
        "_id" : ObjectId("596e564fb5174986059958de")
    },
    "trainer" : {
        "_id" : ObjectId("596e54bfb51749860599589c"),
        "gender" : "male"
    }
}


{
    "_id" : ObjectId("596e5500b5174986059958b2"),
    "breed" : "Poodle",
    "team" : {
        "_id" : ObjectId("596e564fb5174986059958e6")
    },
    "trainer" : {
        "_id" : ObjectId("596e54bfb51749860599589c"),
        "gender" : "male"
    }
}

Essentially, we have searched Dog and joined and matched more collections along the way. The root _id in the final documents is the _id of dog not team, so technically the result set contains dogs containing teams and trainers but you can regard them as "team" document. You could go the other way around, starting from Person and reaching Dog .

Also, the structure of the result is not perfect. You'd probably want a well structured format as population does like teams containing embedded trainer and members . With some tweaks in the aggregation pipeline, I'm sure a well formed structure can be achieved.

Lastly, this is different than Mongoose population, which is suggested in another answer. The major difference is that in this case you've delegated the task of finding the required documents to the mongo server, and obviously in one go. In population, the same would need too much client side processing and many requests to db. But $lookup works on unsharded collection, in which case you may prefer population or consider this answer.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM