pymongo db.collection.update operationFailure

I have a large collection of documents that I'm trying to update with pymongo's update function. For each polygon, I find all documents whose coordinates fall within it and set "Update_key" to update_value on every match.

for element in geomShapeCollection:
    db.collectionName.update({"coordinates":{"$geoWithin":{"$geometry":element["geometry_part"]}}}, {"$set":{"Update_key": update_value}}, multi = True, timeout=False)

For smaller collections this command works as expected. In the largest dataset the command works for 70-80% of the data and then throws the error:

pymongo.errors.OperationFailure: cursor id '428737620678732339' not valid at server

The pymongo documentation tells me that this is possibly due to a timeout issue.

Cursors in MongoDB can timeout on the server if they've been open for a long time without any operations being performed on them.

Reading through the pymongo documentation, the find() function has a boolean flag for timeout.

find(spec=None, fields=None, skip=0, limit=0, timeout=True, snapshot=False, tailable=False, _sock=None, _must_use_master=False,_is_command=False)

However the update function appears not to have this:

update(spec, document, upsert=False, manipulate=False, safe=False, multi=False)

Is there any way to set this timeout flag for the update function, or otherwise avoid this OperationFailure error? Am I correct in assuming this is a timeout error? The pymongo documentation describes OperationFailure only as:

Raised when a database operation fails.

After some research and lots of experimentation I found that it was the outer loop cursor that was causing the error.

for element in geomShapeCollection:

geomShapeCollection is a cursor over a MongoDB collection. Several elements in geomShapeCollection contain a very large number of points. Because the updates for those elements take so long, the geomShapeCollection cursor sits idle between iterations and times out on the server.
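One way to sidestep an idle cursor altogether, assuming the shape documents fit comfortably in memory, is to read them all before starting the updates. This is a minimal sketch rather than the approach used in this answer: the function name and its parameters are hypothetical, and it uses update_many, which is the PyMongo 3+ name for a multi-document update (on older versions, update(..., multi=True) plays the same role).

```python
def tag_points_from_snapshot(shapes_coll, points_coll, update_value):
    """Apply one multi-document update per shape; returns the number of shapes processed.

    shapes_coll and points_coll are assumed to be PyMongo Collection objects.
    """
    # list() drains the cursor immediately, so no server-side cursor is left
    # idle while the slow updates run. Only the field we need is fetched.
    shapes = list(shapes_coll.find({}, {"geometry_part": 1}))

    for shape in shapes:
        points_coll.update_many(
            {"coordinates": {"$geoWithin": {"$geometry": shape["geometry_part"]}}},
            {"$set": {"Update_key": update_value}},
        )
    return len(shapes)
```

The trade-off is memory: every shape document is held in the client at once, so this only suits shape collections of modest size.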

The problem was not with the update function at all. Passing timeout=False to the find() call that creates the outer cursor solves this problem.

# timeout=False belongs on find(); update() has no such parameter
for element in db.geomShapeCollectionName.find(timeout=False):
    db.collectionName.update({"coordinates":{"$geoWithin":{"$geometry":element["geometry_part"]}}}, {"$set":{"Update_key": update_value}}, multi=True)
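For readers on PyMongo 3 or later, the same idea can be sketched with the newer API names: find() takes no_cursor_timeout=True in place of timeout=False, and update_many replaces update(..., multi=True). The function name and parameters below are hypothetical, and the two collection arguments are assumed to be PyMongo Collection objects.

```python
def tag_points_within_shapes(shapes_coll, points_coll, update_value):
    """Run one multi-document update per shape; returns the total documents modified."""
    # no_cursor_timeout=True asks the server not to reap this cursor when it
    # sits idle while a long update runs.
    cursor = shapes_coll.find({}, no_cursor_timeout=True)
    modified = 0
    try:
        for shape in cursor:
            result = points_coll.update_many(
                {"coordinates": {"$geoWithin": {"$geometry": shape["geometry_part"]}}},
                {"$set": {"Update_key": update_value}},
            )
            modified += result.modified_count
    finally:
        # A cursor opened without the idle timeout should be closed explicitly
        # so it does not linger on the server if the loop fails part-way.
        cursor.close()
    return modified
```

It would be called as tag_points_within_shapes(db.geomShapeCollectionName, db.collectionName, update_value). Closing the cursor in a finally block matters more here than usual, because the server will not clean it up on its own idle timer.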
