Bulk Change for a field type in MongoDB

I need to change a field type from int32 to string. This change applies to data on a server and affects a huge number of documents.

With a simple update like the following, only a small part of the documents gets updated because of time issues:

db.collection.find({"identifier": {$exists:true}})
    .forEach( function(x) {
        db.collection.update({_id: x._id}, {$set: {"identifier": 
        x.identifier.toString()}});
    }
);

So I decided to do a bulk change:

var bulk = db.collection.initializeUnorderedBulkOp();
bulk.find({"identifier": {$exists:true}}).update(
    function(x) {
        {_id: x._id}, {$set: {"identifier": x.identifier.toString()}}
    });
bulk.execute();

But it gives an error and does not get executed.

How should I do the update for the bulk to work?

There is no bulk update in the official docs that lets you define a function. What you can do yourself is recreate the bulk operation by using skip and limit.

For this to work, you will have to define the skip and limit values that you want to use. If you update with a batch size of 100, the limit will always be 100, but the skip will increase by 100 every time you run the query.

Example, first run:

db.collection.find({"identifier": {$exists:true}}).sort({_id:1}).skip(0).limit(100)
    .forEach( function(x) {
        db.collection.update({_id: x._id}, {$set: {"identifier": 
        x.identifier.toString()}});
    }
);

Example, second run:

db.collection.find({"identifier": {$exists:true}}).sort({_id:1}).skip(100).limit(100)
    .forEach( function(x) {
        db.collection.update({_id: x._id}, {$set: {"identifier": 
        x.identifier.toString()}});
    }
);

Example, third run:

db.collection.find({"identifier": {$exists:true}}).sort({_id:1}).skip(200).limit(100)
    .forEach( function(x) {
        db.collection.update({_id: x._id}, {$set: {"identifier": 
        x.identifier.toString()}});
    }
);

This way you can control what is being done for every batch of 100 documents.
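The runs above differ only in the skip value, so they can be driven by a loop. Here is a minimal sketch, assuming it runs in the mongo shell, where cursor calls behave synchronously. The helper name `convertBySkipLimit` and its parameters are hypothetical, not part of MongoDB:

```javascript
// Hypothetical helper: repeats the skip/limit runs until a batch comes back empty.
// `collection` is assumed to be a mongo shell collection such as db.collection.
function convertBySkipLimit(collection, batchSize) {
    var skip = 0;
    var converted = 0;
    while (true) {
        // Same query as the manual runs; only the skip value changes.
        var batch = collection.find({"identifier": {$exists: true}})
            .sort({_id: 1}).skip(skip).limit(batchSize).toArray();
        if (batch.length === 0) {
            break;
        }
        batch.forEach(function(x) {
            collection.update({_id: x._id},
                {$set: {"identifier": x.identifier.toString()}});
            converted++;
        });
        skip += batchSize;
    }
    return converted;
}
```

In the shell this would be called as convertBySkipLimit(db.collection, 100). The return value is the number of documents processed, which can be compared against a count of the collection as a sanity check.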

Remember to ALWAYS sort before skipping and limiting, or else the skip operation would give you unpredictable results. You can sort by whatever criteria you want.
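Sorting on _id also allows a variation that avoids skip altogether: remember the last _id of each batch and ask for documents with a greater _id. This is a sketch of that alternative, not something from the original answer, and it again assumes the mongo shell. The name `convertByIdRange` is hypothetical:

```javascript
// Hypothetical helper: pages through the collection by _id range instead of skip.
// `collection` is assumed to be a mongo shell collection such as db.collection.
function convertByIdRange(collection, batchSize) {
    var lastId = null;
    var converted = 0;
    while (true) {
        var query = {"identifier": {$exists: true}};
        if (lastId !== null) {
            // Continue right after the last document of the previous batch.
            query._id = {$gt: lastId};
        }
        var batch = collection.find(query)
            .sort({_id: 1}).limit(batchSize).toArray();
        if (batch.length === 0) {
            break;
        }
        batch.forEach(function(x) {
            collection.update({_id: x._id},
                {$set: {"identifier": x.identifier.toString()}});
            converted++;
        });
        lastId = batch[batch.length - 1]._id;
    }
    return converted;
}
```

The sort on _id is what makes this correct: without it, the last document of a batch would not necessarily carry the largest _id seen so far.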

You could also help the process by having the find operation return only the documents that still need to be converted:

db.collection.find({"identifier": {$exists: true, $not: {$type: "string"}}})
    .forEach( function(x) {
        db.collection.update({_id: x._id}, {$set: {"identifier": 
        x.identifier.toString()}});
    }
);

But don't combine both approaches; choose one or the other. Once a document is converted it no longer matches the filtered find, so an increasing skip would jump over documents that still need converting.

Hope this helps.
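As for the bulk attempt in the question: update() on a bulk operation takes an update document, not a function, so each new value has to be computed on the client and queued as its own operation. A minimal sketch, assuming the mongo shell and the type filter instead of skip. The name `convertWithBulk` is hypothetical:

```javascript
// Hypothetical helper: converts the remaining non-string values batch by batch,
// queuing one explicit updateOne per document in an unordered bulk operation.
// `collection` is assumed to be a mongo shell collection such as db.collection.
function convertWithBulk(collection, batchSize) {
    var filter = {"identifier": {$exists: true, $not: {$type: "string"}}};
    var converted = 0;
    while (true) {
        // No skip here: converted documents stop matching the filter on their own.
        var batch = collection.find(filter).limit(batchSize).toArray();
        if (batch.length === 0) {
            break;
        }
        var bulk = collection.initializeUnorderedBulkOp();
        batch.forEach(function(x) {
            bulk.find({_id: x._id})
                .updateOne({$set: {"identifier": x.identifier.toString()}});
        });
        // Sends the queued updates for this batch together.
        bulk.execute();
        converted += batch.length;
    }
    return converted;
}
```

In the shell this would be called as convertWithBulk(db.collection, 100). A new bulk object is created for every batch because a bulk operation is not meant to be reused after execute().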
