Insert to MongoDB collection that has unique key with Python
I have a collection called englishWords, with a unique index on the "word" field. When I do this:
from pymongo import MongoClient
tasovshik = MongoClient()
db = tasovshik.tongler
coll = db.englishWords
f = open('book.txt')
for word in f.read().split():
    coll.insert( { "word": word } } )
insertion stops at the first word that already exists in the collection, with this error message:
pymongo.errors.DuplicateKeyError: E11000 duplicate key error index: tongler.englishWords.$word_1 dup key: { : "Harry" }
I do not want to implement an existence check myself; I want to rely on the unique index without the insert loop being aborted.
To avoid unnecessary exception handling, you could do an upsert:
from pymongo import MongoClient
tasovshik = MongoClient()
db = tasovshik.tongler
coll = db.englishWords
f = open('book.txt')
for word in f.read().split():
    # upsert=True: insert the document if no matching one is found
    coll.replace_one({'word': word}, {'word': word}, upsert=True)
The last argument (the upsert flag) tells MongoDB to insert the document if no matching one already exists.
Here's the documentation.
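As a variation on the upsert above, here is a minimal sketch that uses update_one with the $setOnInsert operator, so an existing document is left untouched rather than replaced. The helper names add_word and load_words are hypothetical, and coll is assumed to be a PyMongo Collection (anything with a compatible update_one method would work).

```python
def add_word(coll, word):
    """Insert {'word': word} only if no such document exists yet.

    `coll` is assumed to be a pymongo Collection. Returns True when a new
    document was created and False when the word was already present.
    """
    result = coll.update_one(
        {"word": word},                     # match on the uniquely indexed field
        {"$setOnInsert": {"word": word}},   # applied only when the upsert inserts
        upsert=True,
    )
    # upserted_id is None when an existing document matched the filter
    return result.upserted_id is not None


def load_words(coll, path):
    """Upsert every whitespace-separated word of the file; return the number added."""
    with open(path) as f:
        return sum(add_word(coll, word) for word in f.read().split())
```

With the collection from the question this would be called as load_words(coll, 'book.txt'). If several writers run at the same time, an upsert on a uniquely indexed field can still occasionally raise DuplicateKeyError, so this sketch is aimed at a single loader.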
EDIT: For even faster performance with a long list of words, you could do it in bulk like this:
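from pymongo import MongoClient
tasovshik = MongoClient()
db = tasovshik.tongler
coll = db.englishWords
f = open('book.txt')
# Legacy bulk API: deprecated in PyMongo 3.5 and removed in PyMongo 4.0
bulkop = coll.initialize_unordered_bulk_op()
for word in f.read().split():
    # .upsert() must be followed by a write operation, otherwise nothing is queued
    bulkop.find({'word': word}).upsert().replace_one({'word': word})
# send all queued operations to the server in one go
bulkop.execute()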
Taken from the bulk operations documentation.
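The legacy bulk builder (initialize_unordered_bulk_op) was removed in PyMongo 4.0. In current releases the same idea is usually expressed with Collection.bulk_write and UpdateOne requests. The following is only a sketch under that assumption: the helper names unique_in_order, batches and bulk_upsert_words are made up for illustration, and the batch size of 1000 is an arbitrary choice.

```python
def unique_in_order(words):
    """Drop repeated words while keeping first-seen order."""
    return list(dict.fromkeys(words))


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_upsert_words(coll, words, batch_size=1000):
    """Upsert every distinct word using unordered bulk writes.

    `coll` is assumed to be a pymongo Collection.
    """
    # Imported here so the helpers above stay usable without pymongo installed.
    from pymongo import UpdateOne

    for chunk in batches(unique_in_order(words), batch_size):
        requests = [
            UpdateOne({"word": w}, {"$setOnInsert": {"word": w}}, upsert=True)
            for w in chunk
        ]
        # ordered=False lets the server continue past individual failures
        coll.bulk_write(requests, ordered=False)
```

Deduplicating on the client first keeps the batches small, since a book repeats most of its words many times.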
You could do the following:
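import pymongo

for word in f.read().split():
    try:
        # coll.insert was removed in PyMongo 4.0; use coll.insert_one there
        coll.insert({"word": word})
    except pymongo.errors.DuplicateKeyError:
        # the word is already in the collection, skip it
        continue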
This will ignore duplicate-key errors.
Also, did you drop the collection before trying?
I've just run your code and everything looks good, except that you have an extra } at the end of the last line. Delete that, and you don't have to drop any collection. Every insert creates its own batch of data, so there is no need to drop the previous collection.
Well, the error message indicates that the key Harry has already been inserted and you are trying to insert it again. Is this perhaps not your entire code?
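One way to check whether the duplicate comes from the input itself is to count the words in the text before inserting anything. This is a small sketch using only the standard library; the function name repeated_words is hypothetical.

```python
from collections import Counter


def repeated_words(text):
    """Return {word: count} for every whitespace-separated word seen more than once."""
    counts = Counter(text.split())
    return {word: n for word, n in counts.items() if n > 1}
```

If repeated_words(open('book.txt').read()) is non-empty, the same word is sent to the collection more than once within a single run, which is enough to trigger the duplicate key error even on an empty collection.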