$pull an object from an array, and $pull references to it from another array

Question

Consider this document from a collection of clients:

client: {
  services: [
    {
      _id: 111,
      someField: 'someVal'
    },
    {
      _id: 222,
      someField: 'someVal'
    }
    ... // More services
  ],

  staff: [
    {
      _id: 'aaa',
      someField: 'someVal',
      servicesProvided: [111, 222, 333, ...]
    },
    {
      _id: 'bbb',
      someField: 'someVal',
      servicesProvided: [111, 555, 666, ...]
    },
    {
      _id: 'ccc',
      someField: 'someVal',
      servicesProvided: [111, 888, 999, ...]
    }
    ... // More staff
  ]
}

A client can have many staff members. Each staff member holds references to the services he or she provides. If a service is deleted, the references to that service also need to be deleted from every staff member.

I want to delete (pull) an object (a service) from services and, in the same query, delete any references to it in the servicesProvided array of all staff objects.

For example, if I delete the service with _id 111, I also want to delete all references to it in the staff members that provide this service.

How do I write this query?

Answer 1

So this is where things get a little nasty. How indeed do you update "multiple" array items that match the conditions in a single document?

A bit of background here comes from the positional $ operator documentation:

"Nested Arrays: The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value."

That tells "part" of the story, but the main point that is specific to this question is "more than one".

So even though the "nested" part does not strictly describe what needs to be done here, the important factor is "more than one". To demonstrate, let's consider this:

{
  services: [
    {
      _id: 111,
      someField: 'someVal'
    },
    {
      _id: 222,
      someField: 'someVal'
    }
  ],

  staff: [
    {
      _id: 'aaa',
      someField: 'someVal',
      servicesProvided: [111, 222, 333, ...]
    },
    {
      _id: 'bbb',
      someField: 'someVal',
      servicesProvided: [111, 555, 666, ...]
    },
    {
      _id: 'ccc',
      someField: 'someVal',
      servicesProvided: [111, 888, 999, ...]
    }
  ]
}

Now you ask to remove the 111 value. This is always the "first" value in servicesProvided in your example. So if we can assume that to be the case, the update seems to be simple:

 db.collection.update(
     { 
         "_id": ObjectId("542ea4991cf4ad425615b84f"),
     },
     { 
         "$pull": {
             "services": { "_id": 111 },
             "staff.servicesProvided": 111
         }
     }
 )

But that won't do what you expect: the value will not be pulled from all "staff" array elements. In fact, it will not be pulled from any of them. The only thing that will work is this:

 db.collection.update(
     { 
         "_id": ObjectId("542ea4991cf4ad425615b84f"),
         "staff.servicesProvided": 111
     },
     { 
         "$pull": {
             "services": { "_id": 111 },
             "staff.$.servicesProvided": 111
         }
     }
 )

But guess what! Only the "first" matching array element was actually updated. So when you look at the documentation statement quoted above, this is basically what it says will happen.
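To make the "first match only" behaviour concrete, here is a small in-memory sketch in plain JavaScript. It does not talk to MongoDB at all: pullFirstMatch is a hypothetical helper that only imitates what the positional $ update above does to a single document, so treat it as an illustration rather than the server's actual implementation.

```javascript
// Hypothetical in-memory stand-in for the positional "$" update above.
// This is NOT a MongoDB API; it only imitates the behaviour described in the text.
function pullFirstMatch(doc, serviceId) {
  // Query part: { "staff.servicesProvided": serviceId }.
  // "$" is bound to the FIRST staff element that satisfies it.
  const first = doc.staff.find((m) => m.servicesProvided.includes(serviceId));
  if (!first) return 0; // the query did not match, so nothing is modified

  // "$pull": { "services": { "_id": serviceId } }
  doc.services = doc.services.filter((s) => s._id !== serviceId);
  // "$pull": { "staff.$.servicesProvided": serviceId }
  first.servicesProvided = first.servicesProvided.filter((id) => id !== serviceId);
  return 1; // plays the role of nModified
}

const client = {
  services: [{ _id: 111 }, { _id: 222 }],
  staff: [
    { _id: "aaa", servicesProvided: [111, 222, 333] },
    { _id: "bbb", servicesProvided: [111, 555, 666] },
    { _id: "ccc", servicesProvided: [111, 888, 999] },
  ],
};

pullFirstMatch(client, 111);
// "aaa" has lost 111; "bbb" and "ccc" still contain it
console.log(client.staff.map((m) => m.servicesProvided));
```

One thing the sketch also makes visible: because the query itself requires a staff member that references the service, a service that nobody provides would never match, and so would not be pulled from services by this statement.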

Then again, suppose we were testing this in a modern MongoDB shell against a server of MongoDB version 2.6 or greater. Then this is the response we get:

WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

So hang on a moment. We were just told how many documents were "modified" by the last statement. So even though we can only change one element of the array at a time, there is some important feedback here.

The really great thing about the new "WriteResult" objects obtained from "Bulk Operations API" operations, which is in fact what the shell is using here, is that you actually get told whether something was "modified" by the previous statement or not. This is way better than the "legacy" write responses, and it gives us grounds to make some important decisions about looping, such as "Did our last operation actually 'modify' a document, and should we therefore continue?"

So this is an important "flow control" point, even if the general MongoDB API itself cannot "update all elements" at once. Now there is a testable condition for deciding whether to "continue" in a loop or not. This is what I mean by "combining" what you have already learned. So eventually we can come to a listing like this:

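// Uses the "async" module; the synchronous test function is the pre-3.x style of async.whilst
var modified = 1;

async.whilst(
    // Keep looping while the previous round modified the document
    function() { return modified > 0; },
    function(callback) {
        // A bulk object cannot be re-executed, so build a new one each round
        var bulk = db.collection.initializeOrderedBulkOp();

        bulk.find(
            { 
                "_id": ObjectId("542ea4991cf4ad425615b84f"),
                "staff.servicesProvided": 111
            }
        ).updateOne(
            { 
                "$pull": {
                     "services": { "_id": 111 },
                     "staff.$.servicesProvided": 111
                }
            }
        );

        bulk.execute(function(err,result) {
            if (err) return callback(err);
            // nModified is a property of the result, not a method
            modified = result.nModified;
            callback();
        });
    },
    function(err) {
        // Something went wrong during the loop; handle it here
        if (err) throw err;
    }
);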

Or basically something cute like that. So you are asking the "result" object obtained from the "bulk operations" .execute() to tell you whether something was modified or not. Where it was, you iterate the loop again, performing the same update and asking for the result again.

Eventually, the update operation will tell you that "nothing" was modified at all. This is when you exit the loop and continue normal operations.
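The control flow itself, separated from MongoDB, can be sketched like this. Both repeatUntilUnmodified and fakeUpdate are hypothetical names: the first is a generic "run until nothing changes" loop, the second is a stub that stands in for one execution of the update and returns an nModified-style count. The maxRounds cap is an assumption added here as a safety net, not something from the original listing.

```javascript
// Hypothetical helper: calls runUpdate() until it reports 0 modified documents.
// runUpdate stands in for one execution of the update and must return the
// nModified count. maxRounds is an assumed safety cap against endless loops.
function repeatUntilUnmodified(runUpdate, maxRounds = 1000) {
  let rounds = 0;
  while (rounds < maxRounds) {
    const nModified = runUpdate();
    if (nModified === 0) break; // nothing left to change, stop looping
    rounds += 1;
  }
  return rounds; // number of rounds that actually modified something
}

// Stub standing in for the server: each call removes 111 from ONE staff
// member's list, just as the positional "$" update does.
const staff = [[111, 222, 333], [111, 555, 666], [111, 888, 999]];
function fakeUpdate() {
  const i = staff.findIndex((ids) => ids.includes(111));
  if (i === -1) return 0; // query no longer matches
  staff[i] = staff[i].filter((id) => id !== 111);
  return 1;
}

const rounds = repeatUntilUnmodified(fakeUpdate);
console.log(rounds, staff);
```

With three staff members referencing the service, the stub reports a modification three times and then zero, which is the point where the loop ends.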

Now an alternate way to handle this might well be to read in the entire object and then make all the modifications that you require:

db.collection.findOne(
    { 
        "_id": ObjectId("542ea4991cf4ad425615b84f"),
        "staff.servicesProvided": 111
    },
    function(err,doc) {
        // Nothing to do on error or when no document matched
        if (err || !doc) return;

        // Remove the service itself
        doc.services = doc.services.filter(function(item) {
            return item._id != 111;
        });

        // Remove the reference from every staff member
        doc.staff.forEach(function(item) {
            item.servicesProvided = item.servicesProvided.filter(function(sub) {
                return sub != 111;
            });
        });
        db.collection.save( doc );
    }
);

Bit of overkill. Not entirely atomic, since another write can land between the read and the save, but close enough.

So you cannot really do this in a single write operation, at least not without "reading" the document and then "writing" the whole thing back after modifying the content. But you can take an "iterative" approach, and the tools are there to let you control that.

Another possible way to approach this is to change the way you model the data, like this:

{
  "services": [
    {
      "_id": 111,
      "someField": "someVal"
    },
    {
      "_id": 222,
      "someField": "someVal"
    }
  ],

  "provided": [ 
      { "_id": "aaa", "service": 111 },
      { "_id": "aaa", "service": 222 },
      { "_id": "bbb", "service": 111 }
  ]
}

And so on. So then the query becomes something like this:

db.collection.update(
    {  "_id": ObjectId("542ea4991cf4ad425615b84f") },
    {
        "$pull": {
            "services": { "_id": 111 },
            "provided": { "service": 111 }
        }
    }
);

And that truly would be a single update operation that removes everything in one go, because each element is contained in a single, un-nested array.
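As a rough illustration of why the flat model works in one pass, here is an in-memory imitation in plain JavaScript. The pull function is a hypothetical helper that mimics $pull with a simple equality condition; it is not the MongoDB operator itself, and it assumes that entries in provided are matched on their service field.

```javascript
// Hypothetical in-memory imitation of "$pull" with a condition document:
// drops every array element whose fields all equal those in `condition`.
function pull(array, condition) {
  return array.filter(
    (el) => !Object.keys(condition).every((key) => el[key] === condition[key])
  );
}

const client = {
  services: [
    { _id: 111, someField: "someVal" },
    { _id: 222, someField: "someVal" },
  ],
  provided: [
    { _id: "aaa", service: 111 },
    { _id: "aaa", service: 222 },
    { _id: "bbb", service: 111 },
  ],
};

// Both arrays sit at the top level, so each needs only one pass
client.services = pull(client.services, { _id: 111 });
client.provided = pull(client.provided, { service: 111 });

console.log(client);
```

Every reference to service 111 disappears in a single pass because no positional placeholder is needed: the condition is applied to every element of a top-level array.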

So there are ways to do it, but how you model really depends on your application's data access patterns. Choose the solution that suits you best. This is why you chose MongoDB in the first place.

Solution 1
1 accepted 2014-10-03 15:35:30
