
MongoDB: find documents with a given array of subdocuments

I want to find documents that contain given subdocuments. Say I have the following documents in my commits collection:

// Document 1
{ 
  "commit": 1,
  "authors" : [
    {"name" : "Joe", "lastname" : "Doe"},
    {"name" : "Joe", "lastname" : "Doe"}
  ] 
}

// Document 2
{ 
  "commit": 2,
  "authors" : [
    {"name" : "Joe", "lastname" : "Doe"},
    {"name" : "John", "lastname" : "Smith"}
  ] 
}

// Document 3
{ 
  "commit": 3,
  "authors" : [
    {"name" : "Joe", "lastname" : "Doe"}
  ] 
}

All I want from the above collection is the 1st document, since I know I'm looking for a commit with 2 authors who both have the same name and lastname. So I came up with this query:

db.commits.find({
    $and: [
        {'authors': {$elemMatch: {'name': 'Joe', 'lastname': 'Doe'}}},
        {'authors': {$elemMatch: {'name': 'Joe', 'lastname': 'Doe'}}}
    ],
    'authors': {$size: 2}
})

$size is used to filter out the 3rd document, but the query still returns the 2nd document, since both $elemMatch clauses evaluate to true.
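To see why the 2nd document slips through, here is a minimal sketch in plain JavaScript that imitates the query's logic without a MongoDB server. The `elemMatch` helper is a hypothetical stand-in written for illustration, not a MongoDB API; it only captures the idea that $elemMatch asks whether at least one array element satisfies the conditions, so two identical clauses can both be satisfied by the same element.

```javascript
// Plain JavaScript imitation of the query's logic; no MongoDB server involved.
const commits = [
  { commit: 1, authors: [{ name: "Joe", lastname: "Doe" }, { name: "Joe", lastname: "Doe" }] },
  { commit: 2, authors: [{ name: "Joe", lastname: "Doe" }, { name: "John", lastname: "Smith" }] },
  { commit: 3, authors: [{ name: "Joe", lastname: "Doe" }] },
];

// Hypothetical stand-in for $elemMatch: true if ANY element has all the wanted fields.
function elemMatch(array, wanted) {
  return array.some((el) =>
    Object.entries(wanted).every(([key, value]) => el[key] === value)
  );
}

const joeDoe = { name: "Joe", lastname: "Doe" };

// Mirrors the $and of two identical $elemMatch clauses plus the $size: 2 check.
const matched = commits
  .filter(
    (doc) =>
      elemMatch(doc.authors, joeDoe) &&
      elemMatch(doc.authors, joeDoe) &&
      doc.authors.length === 2
  )
  .map((doc) => doc.commit);

console.log(matched); // [ 1, 2 ]
```

Repeating the clause adds no information: the second check passes on the same "Joe Doe" element as the first, so commit 2 is kept.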

I can't use an index on the subdocuments, since the order of the authors used for the search is random. Is there a way to remove the 2nd document from the results without using Mongo's aggregate function?

What you are asking for here is a little different from a standard query. In fact, you are asking for documents where the "name" and "lastname" combination is found in your array two or more times.

Standard query arguments cannot express "how many times" an array element is matched within a document. But of course you can ask the server to "count" that for you using the aggregation framework:

db.collection.aggregate([
    // Match possible documents to reduce the pipeline
    { "$match": {
        "authors": { "$elemMatch": { "name": "Joe", "lastname": "Doe" } }
    }},

    // Unwind the array elements for processing
    { "$unwind": "$authors" },

    // Group back and "count" the matching elements
    { "$group": {
        "_id": "$_id",
        "commit": { "$first": "$commit" },
        "authors": { "$push": "$authors" },
        "count": { "$sum": {
            "$cond": [
                { "$and": [
                    { "$eq": [ "$authors.name", "Joe" ] },
                    { "$eq": [ "$authors.lastname", "Doe" ] }
                ]},
                1,
                0
            ]
        }}
    }},

    // Filter out anything that didn't match at least twice
    { "$match": { "count": { "$gte": 2 } } }
])

So essentially you put your conditions to match inside the $cond operator, which returns 1 where matched and 0 where not, and this is passed to $sum to get a total for the document.
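The counting step can be pictured in plain JavaScript as follows. This is only a sketch of the idea, run against the three sample documents from the question; `countMatches` is a hypothetical helper that plays the role of $cond inside $sum, and nothing here talks to a MongoDB server.

```javascript
// Sample documents from the question, as plain objects.
const commits = [
  { commit: 1, authors: [{ name: "Joe", lastname: "Doe" }, { name: "Joe", lastname: "Doe" }] },
  { commit: 2, authors: [{ name: "Joe", lastname: "Doe" }, { name: "John", lastname: "Smith" }] },
  { commit: 3, authors: [{ name: "Joe", lastname: "Doe" }] },
];

// Hypothetical helper mirroring $cond inside $sum:
// add 1 for each author matching both fields, 0 otherwise.
function countMatches(authors, name, lastname) {
  return authors.reduce(
    (count, a) => count + (a.name === name && a.lastname === lastname ? 1 : 0),
    0
  );
}

// Per-document totals, analogous to the "count" field produced by $group.
const counts = commits.map((doc) => ({
  commit: doc.commit,
  count: countMatches(doc.authors, "Joe", "Doe"),
}));

// Analogous to the final $match on count >= 2.
const result = counts.filter((c) => c.count >= 2).map((c) => c.commit);

console.log(result); // [ 1 ]
```

Commits 2 and 3 each contain only one matching author, so only commit 1 reaches the threshold.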

Then filter out any documents that did not match 2 or more times.
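As a possible alternative that is not part of the answer above: on MongoDB 3.6 or later, the same count can probably be expressed inside a plain find() by combining $expr with $filter and $size, which avoids the $unwind and $group stages. The sketch below only builds the filter object; `buildAuthorCountFilter` is a hypothetical helper name, and it assumes every document has an `authors` array (since $size raises an error on a missing or non-array field).

```javascript
// Hypothetical helper: builds a find() filter that counts matching array elements.
// Assumes MongoDB 3.6+ (for $expr) and that `authors` is always an array.
function buildAuthorCountFilter(name, lastname, minCount) {
  return {
    $expr: {
      $gte: [
        {
          // Number of elements in `authors` that satisfy both conditions.
          $size: {
            $filter: {
              input: "$authors",
              as: "author",
              cond: {
                $and: [
                  { $eq: ["$$author.name", name] },
                  { $eq: ["$$author.lastname", lastname] },
                ],
              },
            },
          },
        },
        minCount,
      ],
    },
  };
}

const filter = buildAuthorCountFilter("Joe", "Doe", 2);
console.log(JSON.stringify(filter, null, 2));
// In the mongo shell this would be passed as: db.commits.find(filter)
```

If exactly two authors are required, as in the question, the original `'authors': {$size: 2}` condition could be added alongside `$expr` in the same filter.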
