简体   繁体   English

MongoDB-将具有数组的一个记录转换为新集合中的多个记录

[英]MongoDB - Convert One record with an array to multiple records in a new collection

[MongoDB shell or pyMongo] I would like to know how to efficiently convert one record in a collection with an array in one field, to multiple records in say anew collection. [MongoDB shell或pyMongo]我想知道如何有效地将一个字段中具有数组的集合中的一条记录转换为一个新集合中的多个记录。 So far, the only solution, I've been able to achieve is iterating the records one by one and then iterating the array in the field I want and do individual inserts. 到目前为止,我能够实现的唯一解决方案是逐个迭代记录,然后迭代我想要的字段中的数组并进行单个插入。 I'm hoping there's a more efficient way to do this. 我希望有一种更有效的方法。

Example: 例:

I want to take a collection in MongoDB with structure similar to : 我想在MongoDB中采用类似于以下结构的集合:

[{
    "_id": 1,
    "points": ["a", "b", "c"]
}, {
    "_id": 2,
    "points": ["d"]
}]

and convert it to something like this: 并将其转换为如下所示:

[{
    "_id": 1,
    "points": "a"
}, {
    "_id": 2,
    "points": "b"
}, {
    "_id": 3,
    "points": "c"
}, {
    "_id": 4,
    "points": "d"
}]

Assuming you're ok with auto-generated _id values in the new collection, you can do this with an aggregation pipeline that uses $unwind to unwind the points array and $out to output the results to a new collection: 假设您可以在新集合中使用自动生成的_id值,则可以使用聚合管道来执行此操作,该管道使用$unwind来展开points组,并使用$out将结果输出到新集合中:

db.test.aggregate([
    // Duplicate each doc, one per points array element
    {$unwind: '$points'},

    // Remove the _id field to prompt regeneration as there are now duplicates
    {$project: {_id: 0}},

    // Output the resulting docs to a new collection, named 'newtest'
    {$out: 'newtest'}
])

Here's another version which can be expected to perform worse than @JohnnyHK's solution because of a second $unwind and a potentially massive $group but it generates integer IDs based on some order that you can specify in the $sort stage: 这是另一个版本,由于第二个$unwind和一个可能庞大的$group ,其性能可能会比@JohnnyHK的解决方案差,但它会根据您可以在$sort阶段指定的某些顺序生成整数ID:

db.collection.aggregate([{
    // flatten the "points" array to get individual documents
    $unwind: { "path": "$points" },
}, {
    // sort by some criterion
    $sort: { "points": 1 }
}, {
    // throw all sorted "points" in the very same massive array
    $group: {
        _id: null,
        "points": { $push: "$points" },
    }
}, {
    // flatten the massive array making each document's position index its `_id` field
    $unwind: {
        "path": "$points",
        includeArrayIndex: "_id"
    }
} , {
    // write results to new "result" collection
    $out: "result"
}], {
    // make sure we do not run into memory issues
    allowDiskUse: true
})

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM