I have a dataframe of almost 120,000 records, as shown below. I also have a MongoDB collection that looks exactly the same as the dataframe.
ItemID  ParentID  ItemRating  ItemPrice  Qty
A1      ItemA1    0           12         100
A2      ItemA2    0           15         200
B1      ItemB1    0           20         300
B2      ItemB2    0           25         400
B3      ItemB3    0           30         150
Now I want to update and insert records from my dataframe into the MongoDB collection with the following condition
I know this can be done with PyMongo's update_many method by setting upsert=True, but I am not sure how to do that. How should I write my filter condition?
Regards Vipul
You won't be able to use update_many(), as it takes a single filter and applies the same update to every matching document; in your case each record needs its own filter and its own values. What you need is replace_one() in a loop with upsert=True. Something like:
from pymongo import MongoClient
import pandas as pd

# Connect to the local MongoDB instance and select the database
db = MongoClient('localhost', 27019)['testdatabase1']

df = pd.DataFrame({'ItemID': ['A1', 'A2', 'B1', 'B2', 'B3'],
                   'ParentID': ['ItemA1', 'ItemA2', 'ItemB1', 'ItemB2', 'ItemB3'],
                   'ItemRating': [0, 0, 0, 0, 0],
                   'ItemPrice': [12, 15, 20, 25, 30],
                   'Qty': [100, 200, 300, 400, 150]
                   })

# For each row, replace the document that matches on ItemID and ParentID,
# or insert the row as a new document if nothing matches (upsert=True)
for _, row in df.iterrows():
    record = row.to_dict()
    result = db.testcollection.replace_one(
        {'ItemID': record.get('ItemID'), 'ParentID': record.get('ParentID')},
        record, upsert=True)
    print(f'{"Replaced: " if result.modified_count == 1 else ""}{"Inserted: " if result.upserted_id is not None else ""} {record}')
Since you are dealing with a lot of data, you probably want to batch the writes instead of sending one request per row: use bulk_write() to execute a list of ReplaceOne operations (each with upsert=True) that you have compiled from your dataframe. This cuts down the round trips to the server; it is a batched request rather than a transaction. See PyMongo: Bulk Write Operations.
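A rough sketch of what that bulk version could look like is below. It assumes, as in the loop above, that ItemID and ParentID together identify a document. The helper names (make_filter, batches, bulk_upsert) and the batch size of 1000 are made up for illustration and are not part of PyMongo; only ReplaceOne, bulk_write() and the result counters come from the library. The function takes a list of plain dicts, which you can get from the dataframe with df.to_dict('records').

```python
from itertools import islice

# Assumption: these two columns together identify a document (same as the loop above)
KEY_FIELDS = ("ItemID", "ParentID")


def make_filter(record, key_fields=KEY_FIELDS):
    """Build the per-record filter document from the key columns."""
    return {field: record[field] for field in key_fields}


def batches(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_upsert(collection, records, batch_size=1000):
    """Upsert `records` into `collection`, one bulk_write() call per batch."""
    # Imported here so the two helpers above can be used without pymongo installed
    from pymongo import ReplaceOne

    totals = {"matched": 0, "modified": 0, "upserted": 0}
    for batch in batches(records, batch_size):
        ops = [ReplaceOne(make_filter(r), r, upsert=True) for r in batch]
        # ordered=False: the server attempts every operation even if one fails
        result = collection.bulk_write(ops, ordered=False)
        totals["matched"] += result.matched_count
        totals["modified"] += result.modified_count
        totals["upserted"] += result.upserted_count
    return totals
```

With the dataframe and db handle from the earlier snippet, the call would be something like bulk_upsert(db.testcollection, df.to_dict('records')). Splitting into batches is optional, but it keeps each request a manageable size with 120,000 rows. An index on the key fields should also help, since every operation looks up a document by them.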