
Is there something similar to a journal for read operations in MongoDB?

I am currently developing a program that reads documents from MongoDB and writes them to a file, like this:

for doc in db.col.find({"field": "bla"}):
    file.write(f"{doc}\n")

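My concern is that something might go wrong while this is running (completing all the writes takes about a week), such as a shutdown or a network problem. My question is: is there something similar to a journal, like the one used for write operations, that would let me resume from a checkpoint, so that I don't have to redo all the writes to the file?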

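This program will do what you want: it writes each document's ID to a separate log file. If everything runs without interruption, it simply completes. If it fails, the log file ensures that the next run starts after the last document written. If the dataset changes in the meantime, it still works as long as the only changes are inserts.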

It requires Python 3.6 or later, for f-strings and pymongo.

import pymongo
import bson.json_util
import pathlib
import os

log_file = "logfile.txt"
output_file = "zip_codes.json"
host = "mongodb+srv://readonly:readonly@demodata.rgl39.mongodb.net/test?retryWrites=true&w=majority"
log_set = frozenset()

# Do we have a log file of previous writes?
if os.path.isfile(log_file):
    with open(log_file, "r") as log_input:
        log_set = frozenset(line.strip() for line in log_input)
        print(f"{log_file} contains {len(log_set)} items")
else:  # let's create one that is empty
    pathlib.Path(log_file).touch()
    print(f"creating {log_file}")

# Connect to MongoDB. We are using a read-only dataset for testing.
client = pymongo.MongoClient(host)
db = client["demo"]
zipcodes = db["zipcodes"]

count = 0
# Note we use bson.json_util to dump each document rather than json.dumps.
# This ensures that we can read this file back into MongoDB.
# Both files are opened in append mode so a restart keeps the earlier output.
with open(output_file, "a") as data_output:
    with open(log_file, "a") as log_output:
        for doc in zipcodes.find():
            # The log stores IDs as text, so compare against str(_id).
            if str(doc["_id"]) not in log_set:  # did we write this record already?
                count = count + 1
                data_output.write(f"{bson.json_util.dumps(doc)}\n")
                data_output.flush()
                log_output.write(f"{doc['_id']}\n")
                log_output.flush()
                print(f"wrote {count} docs")
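
The log-file approach above rereads every document on restart and skips those already logged. A possible variation, which is not part of the original answer, is to read in _id order and store only the last _id written, so that a restart can query for documents after it. The sketch below makes several assumptions: the helper names (load_checkpoint, save_checkpoint, export_from_checkpoint) and the checkpoint file are hypothetical, _id values are assumed to be JSON-serializable and orderable (strings or integers), and json.dumps stands in for bson.json_util.dumps to keep the example free of dependencies. With ObjectId values the stored text would need converting back with bson.ObjectId before querying.

```python
import json
import os


def load_checkpoint(path):
    """Return the last _id recorded in the checkpoint file, or None."""
    if not os.path.isfile(path):
        return None
    with open(path, "r") as f:
        text = f.read().strip()
    return json.loads(text) if text else None


def save_checkpoint(path, last_id):
    """Write the checkpoint to a temporary file, then rename it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(last_id, f)
    os.replace(tmp_path, path)


def export_from_checkpoint(collection, output_path, checkpoint_path):
    """Append documents whose _id is greater than the checkpoint.

    Assumes `collection` behaves like a pymongo collection and that _id
    values are JSON-serializable and orderable. Returns the number written.
    """
    last_id = load_checkpoint(checkpoint_path)
    query = {} if last_id is None else {"_id": {"$gt": last_id}}
    count = 0
    with open(output_path, "a") as out:
        # Sorting by _id makes "everything after last_id" well defined.
        for doc in collection.find(query).sort("_id", 1):
            out.write(json.dumps(doc, default=str) + "\n")
            out.flush()
            save_checkpoint(checkpoint_path, doc["_id"])
            count += 1
    return count
```

With a real server this would be called as export_from_checkpoint(zipcodes, output_file, "checkpoint.json"). If the process dies between writing a document and saving the checkpoint, that one document is written again on restart, so the output may contain a duplicate at the resume point. Saving the checkpoint after every document is also slow; saving every few thousand documents trades a longer replay for less I/O. Like the log-file version, this only picks up inserts whose _id sorts after the checkpoint.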

