Bulk update MongoDB Collection from a CSV file

I have a mongoose schema defined as

const masterSchema = new mongoose.Schema({
  chapter: { type: Number, required: true },
  line: { type: Number, required: true },
  translations: [
    {
      translation: { type: String, required: true },
    },
  ],
});

I am trying to update the collection from a CSV file. The collection has more than 5000 documents.

Sample data

[
  {
    chapter: 1,
    line: 1,
    translations: [
      {
        translation: "xyz",
      },
    ],
  },
  {
    chapter: 1,
    line: 2,
    translations: [
      {
        translation: "abc",
      },
    ],
  },
];

The CSV file has the format

chapter,line,translation
1,1,example1
1,2,example2
....

The output should be

[
  {
    chapter: 1,
    line: 1,
    translations: [
      {
        translation: "xyz",
      },
      {
        translation: "example1",
      },
    ],
  },
  {
    chapter: 1,
    line: 2,
    translations: [
      {
        translation: "abc",
      },
      {
        translation: "example2",
      },
    ],
  },
]

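I am confused about how to use updateMany() to insert the data into the correct documents, and about whether it is the right approach to the problem at all.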

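Assuming chapter + line is not unique (so one CSV row may match several documents, which is why update_many is used), the hardest part of this exercise is parsing the CSV in the first place. CSV tends to throw surprises at you, such as quoted material and unexpected leading and trailing whitespace in parsed columns, so it is worth wrapping some code around it to keep that under control, e.g.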

import csv
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
db = client['testX']
coll = db['foo']

# newline='' lets the csv module handle line endings inside quoted fields.
with open('myFile', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader, None)  # capture and skip header line
    for row in reader:
        if not row:
            continue  # skip blank lines
        print(row)  # for fun
        # Careful: must turn parsed strings into int to match database type.
        # Also, strip whitespace from col 2 for safety:
        coll.update_many({'chapter': int(row[0]), 'line': int(row[1])},
                         {'$push': {'translations': {'translation': row[2].strip()}}})
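
With 5000+ rows, the loop above makes one round trip to the server per row. A possible variation, sketched here under assumptions rather than taken from the original answer, is to collect the updates and send them in batches with pymongo's `bulk_write`. The helper names `rows_to_updates` and `apply_updates` and the `batch_size` default are hypothetical, and the sketch assumes the CSV header is exactly `chapter,line,translation`.

```python
import csv
import io


def rows_to_updates(csvfile):
    """Yield one (filter, update) pair per CSV data row."""
    # DictReader uses the header row for keys; skipinitialspace drops
    # whitespace that directly follows a comma.
    reader = csv.DictReader(csvfile, skipinitialspace=True)
    for row in reader:
        # Parsed values are strings; convert to int to match the stored type.
        flt = {'chapter': int(row['chapter']), 'line': int(row['line'])}
        upd = {'$push': {'translations': {'translation': row['translation'].strip()}}}
        yield flt, upd


def apply_updates(coll, pairs, batch_size=1000):
    """Send the updates in batches; return the total matched-document count."""
    # Imported here so the parsing helper can be used without pymongo installed.
    from pymongo import UpdateMany
    batch, matched = [], 0
    for flt, upd in pairs:
        batch.append(UpdateMany(flt, upd))
        if len(batch) >= batch_size:
            # ordered=False: the updates are independent, so order is not needed.
            matched += coll.bulk_write(batch, ordered=False).matched_count
            batch = []
    if batch:
        matched += coll.bulk_write(batch, ordered=False).matched_count
    return matched


# Parsing can be checked without a database, using an in-memory sample:
sample = io.StringIO("chapter,line,translation\n1,1,example1\n1,2, example2 \n")
pairs = list(rows_to_updates(sample))
```

Against a real collection this would be called as `apply_updates(coll, rows_to_updates(csvfile))` inside the `with open(...)` block. Comparing the returned count with the number of CSV rows is one way to spot rows that matched no document. Note that running the import twice pushes each translation twice, since `$push` does not check for duplicates.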

