MongoDB: How to insert document in collection that exists in other collection as well?
I have two collections, EN_PR2019 and EN_PR2018. They mostly contain the same things but from different years. After inserting all the documents into EN_PR2019, I'm trying to insert documents into EN_PR2018 that may have the same _id as documents in EN_PR2019. I read that I needed to create an index for the collection to be able to have records with the same _id in two different collections. Right now I'm getting:

pymongo.errors.DuplicateKeyError: E11000 duplicate key error collection: Database.EN_PR2018 index: id_1 dup key: { id: null }

How do I insert the same record, with the same _id, into two different collections without raising errors or having to deal with duplicates?

def check_record(collection, record_id):
    """Check if record exists in collection

    Args:
        record_id (str): record _id as in collection
    """
    return collection.find_one({'id': record_id})


def collection_index(collection, index):
    """Checks if index exists for collection,
    and return a new index if not

    Args:
        collection (str): Name of collection in database
        index (str): Dict key to be used as an index
    """
    if index not in collection.index_information():
        return collection.create_index([(index, pymongo.ASCENDING)], unique=True)


def push_upstream(collection, record_id, record):
    """Update record in collection

    Args:
        collection (str): Name of collection in database
        record_id (str): record _id to be put for record in collection
        record (dict): Data to be pushed in collection
    """
    return collection.insert_one({"_id": record_id}, {"$set": record})


def update_upstream(collection, record_id, record):
    """Update record in collection

    Args:
        collection (str): Name of collection in database
        record_id (str): record _id as in collection
        record (dict): Data to be updated in collection
    """
    return collection.update_one({"_id": record_id}, {"$set": record}, upsert=True)


def executePushPlayer(db):
    playerstats = load_file(db.playerfile)
    collection = db.DATABASE[db.league + db.season]
    collection_index(collection, 'id')
    for player in playerstats:
        existingPost = check_record(collection, player['id'])
        if existingPost:
            update_upstream(collection, player['id'], player)
        else:
            push_upstream(collection, player['id'], player)


if __name__ == '__main__':
    test = DB('EN_PR', '2018')
    executePushPlayer(test)

The _id field in every document inserted into a MongoDB database is special: it is always indexed, and that index is a unique index. The uniqueness applies within a single collection, so it is perfectly reasonable to reuse the _id values from one collection in another, as long as the uniqueness constraint is not breached within the new collection.
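
To illustrate the point about _id being unique per collection rather than per database, here is a minimal sketch. The helper name insert_into_each is hypothetical (it is not part of the question's code or of pymongo), and it assumes each item in collections is a pymongo Collection, whose insert_one returns a result object with an inserted_id attribute.

```python
def insert_into_each(collections, document):
    """Insert a copy of `document` into every collection, keeping the same _id.

    `collections` is assumed to be an iterable of pymongo Collection objects.
    Returns the _id reported by each insert, in order.
    """
    inserted_ids = []
    for collection in collections:
        # insert_one adds an "_id" key to the dict it receives when one is
        # missing, so pass a copy to leave the caller's dict untouched.
        result = collection.insert_one(dict(document))
        inserted_ids.append(result.inserted_id)
    return inserted_ids
```

Called with something like [db["EN_PR2018"], db["EN_PR2019"]] and a document that carries an explicit _id, this should store the same _id once in each collection, with no extra index needed. A duplicate key error on _id would only be expected if the same _id were inserted twice into the same collection.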
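
From the error I would guess that several of your documents end up with a null id value. Note that the error names the id_1 index and the key { id: null }, so the field involved is id (the one given a unique index by collection_index), not _id, and a unique index treats a document that lacks the field as having a null value. That points to a problem either in what your load_file function returns or in how the records are written to the collection.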
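
Reading the code in the question, one likely source of the null values is push_upstream: pymongo's insert_one takes a single document, so the {"$set": record} argument is not stored and the inserted document has only an _id and no id field. The sketch below is one way to avoid that, not the only one. It uses a single update_one call with upsert=True for both the insert and the update case, and it sets aside records that have no id instead of writing them. The name upsert_players is hypothetical, and it assumes each record is a dict whose id value should become the document's _id.

```python
def upsert_players(collection, players):
    """Upsert each player dict into `collection`, using its 'id' as the _id.

    `collection` is assumed to be a pymongo Collection. Returns the list of
    records that were skipped because they had no usable 'id'.
    """
    skipped = []
    for player in players:
        record_id = player.get("id")
        if record_id is None:
            # A missing or null id is what a unique index on "id" reports
            # as { id: null }, so do not write these records.
            skipped.append(player)
            continue
        # One call covers both cases: update the document if the _id already
        # exists in this collection, insert it otherwise.
        collection.update_one({"_id": record_id}, {"$set": player}, upsert=True)
    return skipped
```

Because the lookup is on _id, which is already uniquely indexed in every collection, this version should not need the extra unique index on id, and running it against EN_PR2018 and then EN_PR2019 with overlapping ids should not conflict. Anything returned in the skipped list is worth checking against load_file.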