Using Pymongo Upsert to Update or Create a Document in MongoDB using Python

I have a dataframe that contains data I want to upload into MongoDB. Below is the data:

    MongoRow = pd.DataFrame.from_dict({'school': {1: schoolID}, 'student': {1: student}, 'date': {1: dateToday}, 'Probability': {1: probabilityOfLowerThanThreshold}})

                     school                   student        date  Probability
1  5beee5678d62101c9c4e7dbb  5bf3e06f9a892068705d8420  2020-03-27     0.000038

I have the following code, which checks whether a row in Mongo contains the same student ID and date. If it doesn't, it adds the row:

def getPredictions(school):
    schoolDB = DB[school['database']['name']]
    schoolPredictions = schoolDB['session_attendance_predicted']
    Predictions = schoolPredictions.aggregate([{
        '$project': {
            'school': '$school',
            'student':'$student',
            'date':'$date'
        }        
    }])
    return list(Predictions)
Predictions = getPredictions(school)
Predictions = pd.DataFrame(Predictions)

schoolDB = DB[school['database']['name']]
collection = schoolDB['session_attendance_predicted']
import json

for i in Predictions.index:
    schoolOld = Predictions.loc[i,'school']
    studentOld = Predictions.loc[i,'student']
    dateOld = Predictions.loc[i,'date']
    if(studentOld == student and date == dateOld):
        print("Student Exists")
        #UPDATE THE ROW WITH NEW VALUES
    else:
        print("Student Doesn't Exist")
        records = json.loads(df.T.to_json()).values()
        collection.insert(records)

However, if it does exist, I want it to update the row with the new values. Does anyone know how to do this? I have looked at pymongo upsert but I'm not sure how to use it. Can anyone help?

UPDATE

The above is partly working now. However, I am now getting an error with the following code:

dateToday = datetime.datetime.combine(dateToday, datetime.time(0, 0))

MongoRow = pd.DataFrame.from_dict({'school': {1: schoolID}, 'student': {1: student}, 'date': {1: dateToday}, 'Probability': {1: probabilityOfLowerThanThreshold}})
data_dict = MongoRow.to_dict()

for i in Predictions.index:
    print(Predictions)
    collection.replace_one({'student': student, 'date': dateToday}, data_dict, upsert=True)

Error:

InvalidDocument: documents must have only string keys, key was 1

To upsert, you cannot use insert() (deprecated), insert_one() or insert_many(). You must use one of the collection-level methods that supports upserting.

To get started, I would point you towards reading the dataframe row by row and using replace_one() on each row. There are more advanced ways of doing this, but this is the easiest.

Your code will look a bit like:

collection.replace_one({'student': student, 'date': date}, record, upsert=True)
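The row-by-row approach can be sketched as follows. The InvalidDocument error quoted in the question comes from calling to_dict() with no arguments: pandas then returns a nested {column: {index: value}} mapping, so the index label 1 ends up as a non-string key. Calling MongoRow.to_dict(orient="records") instead gives a list of flat dicts, one per row. The helper name upsert_rows and the sample row below are illustrative assumptions, and collection is assumed to be a pymongo Collection.

```python
import datetime


def upsert_rows(collection, rows):
    """Replace-or-insert one document per row, matched on student and date.

    `collection` is assumed to be a pymongo Collection; `rows` is an
    iterable of flat dicts with string keys.
    """
    for row in rows:
        # The filter must use the same field names as the stored documents.
        key = {"student": row["student"], "date": row["date"]}
        collection.replace_one(key, row, upsert=True)


# Hypothetical row, shaped like one entry of
# MongoRow.to_dict(orient="records") for the data in the question.
rows = [{
    "school": "5beee5678d62101c9c4e7dbb",
    "student": "5bf3e06f9a892068705d8420",
    "date": datetime.datetime(2020, 3, 27),
    "Probability": 0.000038,
}]

# upsert_rows(collection, rows)  # requires a live MongoDB collection
```

Because each row carries its own match key, there is no need to read the existing predictions back into a dataframe and loop over them first.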

A number of people will probably be confused by the accepted answer, as it suggests using replace_one with the upsert flag.

Upserting means 'update or insert' (up = update, sert = insert). Most people looking to 'upsert' should be using update_one with the upsert flag.

For example:

collection.update_one({'matchable_field': field_data_to_match}, {"$set": upsertable_data}, upsert=True)
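Applied to the question's data, that call might look like the sketch below. The practical difference is that replace_one overwrites the whole matched document, while update_one with $set changes only the listed fields and leaves any others in place. The function name upsert_prediction and its parameters are illustrative assumptions; collection is assumed to be a pymongo Collection.

```python
def upsert_prediction(collection, student, date, fields):
    """Update the matching document's fields, or insert it if none matches."""
    return collection.update_one(
        {"student": student, "date": date},  # match key
        {"$set": fields},                    # only these fields are written
        upsert=True,
    )


# Example call (requires a live MongoDB collection):
# upsert_prediction(collection, student, dateToday,
#                   {"school": schoolID,
#                    "Probability": probabilityOfLowerThanThreshold})
```

When no document matches, MongoDB builds the new document from the equality fields in the filter plus the $set fields, so student and date do not need to be repeated in fields.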
