简体   繁体   English

MongoDB更新嵌套数组foreach

[英]MongoDB update nested array foreach

I have collection of Users, and each user has an array Ancestors, previous developer did wrong DB architecture, and now each of ancestors is string but must be an ObjectId . 我收集了Users,每个用户都有一个数组Ancestors,以前的开发人员做错了DB体系结构,现在每个祖先都是string,但是必须是ObjectId It still contain objectId(in fact HEX of object Id, like 558470744a73274db0f0d65d ). 它仍然包含objectId(实际上是对象ID的十六进制,例如558470744a73274db0f0d65d )。 How can I convert each of ancestors to ObjectId? 如何将每个祖先转换为ObjectId? I wrote this: 我这样写:

db.getCollection('Users').find({}).forEach(function(item){
  if (item.Ancestors instanceof Array){
      var tmp = new Array()
      item.Ancestors.forEach(function(ancestor){
          if (ancestor instanceof String){
               tmp.push(ObjectId(ancestor))
             }
          })
          item.Ancestors = tmp
          db.getCollection('Users').save(item) 
  }
})

But looks like it works not properly, and some of ancestors now is ObjectId, and some null . 但是看起来它无法正常工作,现在有些祖先是ObjectId,有些是null And also ancestors can be null from start. 而且祖先从一开始就可以为空。 so I put all that if 's 所以我把所有的if

Try like this using mongoose, 像猫鼬一样尝试

var mongoose = require('mongoose');

db.getCollection('Users').find({}).forEach(function(item){
  if (item.Ancestors instanceof Array){
      var tmp = new Array()
      item.Ancestors.forEach(function(ancestor){
          if (ancestor instanceof String){
               tmp.push(mongoose.Types.ObjectId(ancestor))
             }
          })
          item.Ancestors = tmp
          db.getCollection('Users').save(item) 
  }
})

The solution concept here is to loop through your collection with a cursor and for each document within the cursor, gather data about the index position of the Ancestors array elements. 这里的解决方案概念是使用游标遍历您的集合,并为游标中的每个文档收集有关Ancestors数组元素的索引位置的数据。

You will then use this data later on in the loop as the update operation parameters to correctly identify the elements to update. 然后,您稍后将在循环中将此数据用作更新操作参数,以正确标识要更新的元素。

Supposing your collection is not that humongous, the intuition above can be implemented using the forEach() method of the cursor as you have done in your attempts to do the iteration and getting the index data for all the array elements involved. 假设您的集合不是那么笨拙,可以使用游标的forEach()方法实现上述直觉,就像您尝试进行迭代并获取所有涉及的数组元素的索引数据一样。

The following demonstrates this approach for small datasets: 下面展示了针对小型数据集的这种方法:

function isValidHexStr(id) {
    var checkForHexRegExp = new RegExp("^[0-9a-fA-F]{24}$");
    if(id == null) return false;
    if(typeof id == "string") {
        return id.length == 12 || (id.length == 24 && checkForHexRegExp.test(id));
    }
    return false;
};


db.users.find({"Ancestors.0": { "$exists": true, "$type": 2 }}).forEach(function(doc){ 
    var ancestors = doc.Ancestors,
        updateOperatorDocument = {}; 
    for (var idx = 0; idx < ancestors.length; idx++){ 
        if(isValidHexStr(ancestors[idx]))                   
            updateOperatorDocument["Ancestors."+ idx] = ObjectId(ancestors[idx]);           
    };  
    db.users.updateOne(
        { "_id": doc._id },
        { "$set": updateOperatorDocument }
    );      
});

Now for improved performance especially when dealing with large collections, take advantage of using a Bulk() API for updating the collection in bulk. 现在,为了提高性能,尤其是在处理大型集合时,请利用Bulk() API Bulk()更新集合。 This is quite effecient as opposed to the above operations because with the bulp API you will be sending the operations to the server in batches (for example, say a batch size of 1000) which gives you much better performance since you won't be sending every request to the server but just once in every 1000 requests, thus making your updates more efficient and quicker. 与上述操作相反,这非常有效,因为使用bulp API时,您将分批将操作发送到服务器(例如,批量大小为1000),这将为您提供更好的性能,因为您不会发送对服务器的每个请求,但每1000个请求中只有一个,因此使您的更新更高效,更快。

The following examples demonstrate using the Bulk() API available in MongoDB versions >= 2.6 and < 3.2 . 以下示例演示了如何使用MongoDB版本>= 2.6< 3.2可用的Bulk() API。

function isValidHexStr(id) {
    var checkForHexRegExp = new RegExp("^[0-9a-fA-F]{24}$");
    if(id == null) return false;
    if(typeof id == "string") {
        return id.length == 12 || (id.length == 24 && checkForHexRegExp.test(id));
    }
    return false;
};

var bulkUpdateOps = db.users.initializeUnOrderedBulkOp(), 
    counter = 0;

db.users.find({"Ancestors.0": { "$exists": true, "$type": 2 }}).forEach(function(doc){ 
    var ancestors = doc.Ancestors,
        updateOperatorDocument = {}; 
    for (var idx = 0; idx < ancestors.length; idx++){ 
        if(isValidHexStr(ancestors[idx]))                   
            updateOperatorDocument["Ancestors."+ idx] = ObjectId(ancestors[idx]);           
    };
    bulkUpdateOps.find({ "_id": doc._id }).update({ "$set": updateOperatorDocument })

    counter++;  // increment counter for batch limit
    if (counter % 1000 == 0) { 
        // execute the bulk update operation in batches of 1000
        bulkUpdateOps.execute(); 
        // Re-initialize the bulk update operations object
        bulkUpdateOps = db.users.initializeUnOrderedBulkOp();
    } 
})

// Clean up remaining operation in the queue
if (counter % 1000 != 0) { bulkUpdateOps.execute(); }

The next example applies to the new MongoDB version 3.2 which has since deprecated the Bulk() API and provided a newer set of apis using bulkWrite() . 下一个示例适用于新的MongoDB 3.2版,此版本已弃用 Bulk() API,并使用bulkWrite()提供了一组较新的api。

It uses the same cursors as above but creates the arrays with the bulk operations using the same forEach() cursor method to push each bulk write document to the array. 它使用与上面相同的游标,但是使用相同的forEach()游标方法通过批量操作创建数组,以将每个批量写入文档推入数组。 Because write commands can accept no more than 1000 operations, you will need to group your operations to have at most 1000 operations and re-intialise the array when the loop hits the 1000 iteration: 由于写命令最多只能接受1000个操作,因此您需要将操作分组以最多具有1000个操作,并在循环达到1000次迭代时重新初始化数组:

var cursor = db.users.find({"Ancestors.0": { "$exists": true, "$type": 2 }}),
    bulkUpdateOps = [];

cursor.forEach(function(doc){ 
    var ancestors = doc.Ancestors,
        updateOperatorDocument = {}; 
    for (var idx = 0; idx < ancestors.length; idx++){ 
        if(isValidHexStr(ancestors[idx]))                   
            updateOperatorDocument["Ancestors."+ idx] = ObjectId(ancestors[idx]);           
    };
    bulkUpdateOps.push({ 
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": { "$set": updateOperatorDocument }
         }
    });

    if (bulkUpdateOps.length == 1000) {
        db.users.bulkWrite(bulkUpdateOps);
        bulkUpdateOps = [];
    }
});         

if (bulkUpdateOps.length > 0) { db.users.bulkWrite(bulkUpdateOps); }

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM