[英]MongoDB - Convert One record with an array to multiple records in a new collection
[MongoDB shell or pyMongo] I would like to know how to efficiently convert one record in a collection with an array in one field, to multiple records in say anew collection. [MongoDB shell或pyMongo]我想知道如何有效地将一个字段中具有数组的集合中的一条记录转换为一个新集合中的多个记录。 So far, the only solution, I've been able to achieve is iterating the records one by one and then iterating the array in the field I want and do individual inserts.
到目前为止,我能够实现的唯一解决方案是逐个迭代记录,然后迭代我想要的字段中的数组并进行单个插入。 I'm hoping there's a more efficient way to do this.
我希望有一种更有效的方法。
Example: 例:
I want to take a collection in MongoDB with structure similar to : 我想在MongoDB中采用类似于以下结构的集合:
[{
"_id": 1,
"points": ["a", "b", "c"]
}, {
"_id": 2,
"points": ["d"]
}]
and convert it to something like this: 并将其转换为如下所示:
[{
"_id": 1,
"points": "a"
}, {
"_id": 2,
"points": "b"
}, {
"_id": 3,
"points": "c"
}, {
"_id": 4,
"points": "d"
}]
Assuming you're ok with auto-generated _id
values in the new collection, you can do this with an aggregation pipeline that uses $unwind
to unwind the points
array and $out
to output the results to a new collection: 假设您可以在新集合中使用自动生成的
_id
值,则可以使用聚合管道来执行此操作,该管道使用$unwind
来展开points
组,并使用$out
将结果输出到新集合中:
db.test.aggregate([
// Duplicate each doc, one per points array element
{$unwind: '$points'},
// Remove the _id field to prompt regeneration as there are now duplicates
{$project: {_id: 0}},
// Output the resulting docs to a new collection, named 'newtest'
{$out: 'newtest'}
])
Here's another version which can be expected to perform worse than @JohnnyHK's solution because of a second $unwind
and a potentially massive $group
but it generates integer IDs based on some order that you can specify in the $sort
stage: 这是另一个版本,由于第二个
$unwind
和一个可能庞大的$group
,其性能可能会比@JohnnyHK的解决方案差,但它会根据您可以在$sort
阶段指定的某些顺序生成整数ID:
db.collection.aggregate([{
// flatten the "points" array to get individual documents
$unwind: { "path": "$points" },
}, {
// sort by some criterion
$sort: { "points": 1 }
}, {
// throw all sorted "points" in the very same massive array
$group: {
_id: null,
"points": { $push: "$points" },
}
}, {
// flatten the massive array making each document's position index its `_id` field
$unwind: {
"path": "$points",
includeArrayIndex: "_id"
}
} , {
// write results to new "result" collection
$out: "result"
}], {
// make sure we do not run into memory issues
allowDiskUse: true
})
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.