Bulk update MongoDB Collection from a CSV file
I have a mongoose schema defined as
const masterSchema = new mongoose.Schema({
  chapter: { type: Number, required: true },
  line: { type: Number, required: true },
  translations: [
    {
      translation: { type: String, required: true },
    },
  ],
});
I am trying to update the collection from a CSV file. The collection has more than 5000 documents.
Sample data
[
  {
    chapter: 1,
    line: 1,
    translations: [
      {
        translation: "xyz",
      },
    ],
  },
  {
    chapter: 1,
    line: 2,
    translations: [
      {
        translation: "abc",
      },
    ],
  },
];
The CSV file has the format
chapter,line,translation
1,1,example1
1,2,example2
....
The output should be
[
  {
    chapter: 1,
    line: 1,
    translations: [
      {
        translation: "xyz",
      },
      {
        translation: "example1",
      },
    ],
  },
  {
    chapter: 1,
    line: 2,
    translations: [
      {
        translation: "abc",
      },
      {
        translation: "example2",
      },
    ],
  },
];
I am confused about how to use updateMany() to insert the data into the correct documents (if that is even the right way to solve the problem).
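Assuming the combination of chapter + line is not unique, the hardest part of this exercise is parsing the CSV in the first place. CSVs tend to throw surprises at you, such as quoted material and unexpected leading or trailing whitespace in parsed columns, so it is best to wrap the parsing in a little code that keeps it under control, for example: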
import csv
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
db = client['testX']
coll = db['foo']

# newline='' lets the csv module handle line endings inside quoted fields.
with open('myFile', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader, None)  # capture and skip header line
    for row in reader:
        print(row)  # for fun
        # Careful: must turn parsed strings into int to match database type.
        # Also, strip whitespace from col 2 for safety:
        coll.update_many({'chapter': int(row[0]), 'line': int(row[1])},
                         {'$push': {'translations': {'translation': row[2].strip()}}})
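The loop above sends one update per CSV row. With several thousand rows, it may be worth separating the parsing from the database writes and sending the updates in batches. The following is only a sketch under stated assumptions: the helper names `build_updates`, `chunks`, and `apply_updates` are hypothetical, the batch size of 1000 is an arbitrary choice, and the CSV is assumed to have the `chapter,line,translation` header shown in the question. It relies on pymongo's `UpdateMany` operation and `Collection.bulk_write`.

```python
import csv
import io


def build_updates(csv_text):
    """Parse CSV text into (filter, update) pairs suitable for update_many.

    Hypothetical helper: assumes a 'chapter,line,translation' header row.
    """
    # skipinitialspace=True drops whitespace that follows a delimiter;
    # quoted fields (including embedded commas) are handled by the csv module.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    pairs = []
    for row in reader:
        # Convert to int so the filter matches the Number fields in the schema.
        flt = {'chapter': int(row['chapter']), 'line': int(row['line'])}
        # Strip stray whitespace from the translation text before pushing it.
        upd = {'$push': {'translations': {'translation': row['translation'].strip()}}}
        pairs.append((flt, upd))
    return pairs


def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def apply_updates(coll, pairs, batch_size=1000):
    """Send the updates to MongoDB in batches (batch_size is an assumption)."""
    # Imported here so the parsing helpers above work without pymongo installed.
    from pymongo import UpdateMany

    for batch in chunks(pairs, batch_size):
        # ordered=False lets the server continue past an individual failure.
        coll.bulk_write([UpdateMany(f, u) for f, u in batch], ordered=False)


if __name__ == '__main__':
    sample = 'chapter,line,translation\n1,1,example1\n1, 2,"  hello, world "\n'
    for flt, upd in build_updates(sample):
        print(flt, upd)
```

To run it against the database, read the file contents with `open('myFile', newline='').read()`, pass them to `build_updates`, and hand the result to `apply_updates(coll, pairs)` with the same `coll` as above. Keeping the parsing in a pure function makes it easy to check against awkward rows before anything is written.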