
Degrading performance of MongoDB document updates as record grows

I have an iOS application that sends batches of data to an API endpoint, which stores the data in a MongoDB database. My data is modeled like:

{
"_id" : ObjectId,
"device_id" : Uuid,
"rtfb_status": bool,
"repetitions" : [
    {
        "session_id" : Uuid,
        "set_id" : Uuid,
        "level" : String,
        "exercise" : String,
        "number" : i32,
        "rom" : f64,
        "duration" : f64,
        "time" : i64
    },
    ...,
],
"imu_data": [
    {
        "session_id": Uuid,
        "data": [
            {
                "acc" : {
                    "y" : f64,
                    "z" : f64,
                    "x" : f64,
                    "time" : i64,
                },
                "gyro" : {
                    "y" : f64,
                    "z" : f64,
                    "x" : f64,
                    "time" : i64,
                }
            },
            ...,
        ]
    },
    ...,
]
}

My application simply appends each incoming batch to the relevant array:

async fn append_to_list<S: Serialize + From<I>, I>(
    self,
    collection: Collection,
    source: I,
    field_name: &str,
) -> Result<i64, CollectionError> {
    let new_records =
        bson::to_bson(&S::from(source)).map_err(CollectionError::DbSerializationError)?;

    self.update_user_record(
        collection,
        bson::doc! { "$push": { field_name: new_records } },
    )
    .await
}

async fn update_user_record(
    self,
    collection: Collection,
    document: bson::Document,
) -> Result<i64, CollectionError> {
    let query = self.try_into()?;

    let update_options = mongodb::options::UpdateOptions::builder()
        .upsert(true)
        .build();

    let updated_res = collection
        .update_one(query, document, update_options)
        .await
        .map_err(CollectionError::DbError)?;

    Ok(updated_res.modified_count)
}

pub async fn add_imu_records(
    self,
    collection: Collection,
    imurecords: JsonImuRecordSet,
) -> Result<i64, CollectionError> {
    self.append_to_list::<ImuDataUpdate, _>(collection, imurecords, "imu_data")
        .await
}

Everything works, but write performance degrades over time. From my application's logger output:

With small records

 INFO  data_server > 127.0.0.1:50789 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 16.78034ms
 INFO  data_server > 127.0.0.1:50816 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 7.737755ms
 INFO  data_server > 127.0.0.1:50817 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 7.143721ms
 INFO  data_server > 127.0.0.1:50789 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 5.021643ms
 INFO  data_server > 127.0.0.1:50818 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 7.644989ms
 INFO  data_server > 127.0.0.1:50816 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 4.456604ms
 INFO  data_server > 127.0.0.1:50817 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 2.822192ms
 INFO  data_server > 127.0.0.1:50789 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 1.820112ms
 INFO  data_server > 127.0.0.1:50818 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 1.850234ms
 INFO  data_server > 127.0.0.1:50816 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 1.801561ms
 INFO  data_server > 127.0.0.1:50789 "PUT /v1/add_imu_records HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 26.722725ms

Note: the add_imu_records call carries a much larger payload, so I expect it to take longer.

But after a relatively short time (roughly 10 minutes), the writes take MUCH longer:

After ~10 mins of data

INFO  data_server > 127.0.0.1:50816 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 23.000502ms
INFO  data_server > 127.0.0.1:50818 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 23.23503ms
INFO  data_server > 127.0.0.1:50789 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 114.679434ms
INFO  data_server > 127.0.0.1:50817 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 143.392153ms
INFO  data_server > 127.0.0.1:50816 "PUT /v1/add_repetition HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 65.101141ms
INFO  data_server > 127.0.0.1:50818 "PUT /v1/add_imu_records HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 117.456596ms

Am I doing something wrong? Is MongoDB simply the wrong tool here, and should I use an RDBMS instead? I have a branch of this server that runs on Postgres; its response times are slower than Mongo's at its best, but they stay stable.

Postgres based server log

INFO  data_server > 172.17.0.1:54918 "PUT /v1/add_repetition HTTP/1.1" 201 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 7.300945ms
 INFO  data_server > 172.17.0.1:54906 "PUT /v1/add_repetition HTTP/1.1" 201 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 5.927394ms
 INFO  data_server > 172.17.0.1:54910 "PUT /v1/add_repetition HTTP/1.1" 201 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 6.025674ms
 INFO  data_server > 172.17.0.1:54914 "PUT /v1/add_imu_records HTTP/1.1" 200 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 45.430983ms
 INFO  data_server > 172.17.0.1:54906 "PUT /v1/add_repetition HTTP/1.1" 201 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 11.442257ms
 INFO  data_server > 172.17.0.1:54910 "PUT /v1/add_repetition HTTP/1.1" 201 "-" "client/2.0 (edu.odu.cs.nightly; build:385; iOS 13.7.0) Alamofire/5.2.1" 6.875235ms

Mongo is running on my machine in a Docker container. According to Object.bsonsize, the document is 4484480 bytes (about 4.48 MB).

In order to update a document, MongoDB must fetch the entire document from disk (unless it is already in the cache), mutate it in memory, and write the whole thing back to disk. The modification is also written to the oplog for replication to the secondary nodes.

As the document grows, each of these steps takes longer, and because the document occupies ever more space in the WiredTiger cache, cache churn will also start to eat away at the performance of unrelated queries.
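Some back-of-the-envelope arithmetic shows why this hurts: since each `$push` rewrites the whole document, the cumulative bytes rewritten grow quadratically with the number of appends. The numbers below are hypothetical, chosen only to illustrate the shape of the curve, not measurements from the poster's workload:

```rust
/// Cumulative bytes rewritten after `appends` updates, where the document
/// starts empty and each update appends one `record_bytes`-sized record.
/// Update k rewrites a document of size k * record_bytes, so the total is
/// record_bytes * (1 + 2 + ... + n) = record_bytes * n * (n + 1) / 2.
fn total_bytes_rewritten(appends: u64, record_bytes: u64) -> u64 {
    appends * (appends + 1) / 2 * record_bytes
}

fn main() {
    // Hypothetical: 10,000 appends of ~450-byte IMU samples.
    let total = total_bytes_rewritten(10_000, 450);
    // Roughly 22.5 GB of write traffic for only ~4.5 MB of actual data.
    println!("{} bytes rewritten in total", total);
}
```

Doubling the number of appends roughly quadruples the total I/O, which matches the steadily climbing latencies in the logs above.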

The maximum document size in MongoDB is 16 MB. If this document already reaches 4 MB after 10 minutes of data, it will need to be split or capped soon. The usual fix is the bucket pattern: instead of one ever-growing document per device, store one document per device per session or per time window, so each document stays small and appends stay cheap.
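A minimal sketch of the bucketing idea, assuming time-windowed buckets (the function and field layout are illustrative, not part of the poster's code): derive a bucket key from the device id and the start of the time window, and use that key in the upsert filter so each window gets its own document.

```rust
/// Compute a bucket key "device_id:window_start" for a sample taken at
/// `time_ms` (Unix millis), using fixed windows of `window_ms` milliseconds.
/// Upserting with this key in the filter puts each window in its own
/// document, keeping every document small.
fn bucket_key(device_id: &str, time_ms: i64, window_ms: i64) -> String {
    // Round down to the start of the window containing `time_ms`.
    let window_start = time_ms - time_ms.rem_euclid(window_ms);
    format!("{}:{}", device_id, window_start)
}

fn main() {
    const HOUR_MS: i64 = 3_600_000;
    // Two samples 10 seconds apart land in the same 1-hour bucket;
    // a sample two hours later lands in a different one.
    println!("{}", bucket_key("dev-1", 1_600_000_000_000, HOUR_MS));
    println!("{}", bucket_key("dev-1", 1_600_000_010_000, HOUR_MS));
    println!("{}", bucket_key("dev-1", 1_600_007_200_000, HOUR_MS));
}
```

With this layout the `$push` in `append_to_list` can stay as-is; only the filter built in `update_user_record` would need to include the bucket key, and reads reassemble a session by querying all of its buckets.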
