简体   繁体   中英

MongoDB - Convert One record with an array to multiple records in a new collection

[MongoDB shell or pyMongo] I would like to know how to efficiently convert one record in a collection with an array in one field, to multiple records in say anew collection. So far, the only solution, I've been able to achieve is iterating the records one by one and then iterating the array in the field I want and do individual inserts. I'm hoping there's a more efficient way to do this.

Example:

I want to take a collection in MongoDB with structure similar to :

[{
    "_id": 1,
    "points": ["a", "b", "c"]
}, {
    "_id": 2,
    "points": ["d"]
}]

and convert it to something like this:

[{
    "_id": 1,
    "points": "a"
}, {
    "_id": 2,
    "points": "b"
}, {
    "_id": 3,
    "points": "c"
}, {
    "_id": 4,
    "points": "d"
}]

Assuming you're ok with auto-generated _id values in the new collection, you can do this with an aggregation pipeline that uses $unwind to unwind the points array and $out to output the results to a new collection:

db.test.aggregate([
    // Duplicate each doc, one per points array element
    {$unwind: '$points'},

    // Remove the _id field to prompt regeneration as there are now duplicates
    {$project: {_id: 0}},

    // Output the resulting docs to a new collection, named 'newtest'
    {$out: 'newtest'}
])

Here's another version which can be expected to perform worse than @JohnnyHK's solution because of a second $unwind and a potentially massive $group but it generates integer IDs based on some order that you can specify in the $sort stage:

db.collection.aggregate([{
    // flatten the "points" array to get individual documents
    $unwind: { "path": "$points" },
}, {
    // sort by some criterion
    $sort: { "points": 1 }
}, {
    // throw all sorted "points" in the very same massive array
    $group: {
        _id: null,
        "points": { $push: "$points" },
    }
}, {
    // flatten the massive array making each document's position index its `_id` field
    $unwind: {
        "path": "$points",
        includeArrayIndex: "_id"
    }
} , {
    // write results to new "result" collection
    $out: "result"
}], {
    // make sure we do not run into memory issues
    allowDiskUse: true
})

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM