
Poor write performance with MongoDB 5.0.8 in a PSA (Primary-Secondary-Arbiter) setup

I am struggling with write performance on MongoDB 5.0.8 in a PSA (Primary-Secondary-Arbiter) deployment when one data-bearing member goes down.

I am aware of the "Mitigate Performance Issues with PSA Replica Set" page and the procedure to temporarily work around this issue.
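For reference, the temporary workaround from that page essentially strips the unavailable data-bearing member of its vote and priority, roughly like this (member index 1 is just an example for the node that is down):

cfg = rs.conf()
cfg.members[1].votes = 0      // index of the data-bearing member that is down
cfg.members[1].priority = 0
rs.reconfig(cfg)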

However, in my opinion, the manual intervention described there should not be necessary during normal operation. So what can I do to ensure that the system keeps running efficiently even if a node fails, just as it did in MongoDB 4.x with the option "enableMajorityReadConcern=false"?
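For reference, in MongoDB 4.x that behaviour came from the mongod configuration (if I remember correctly, the equivalent mongod.conf entry looks like this):

replication:
  enableMajorityReadConcern: false

As far as I know, MongoDB 5.0 always runs with this value set to true and no longer lets you disable it, which is why I am looking for another way.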

As I understand it, the problem has something to do with the defaultRWConcern. When configuring a PSA replica set in MongoDB you are forced to set the default read/write concern explicitly, otherwise the following message appears when rs.addArb is called:

MongoServerError: Reconfig attempted to install a config that would change the implicit default write concern. Use the setDefaultRWConcern command to set a cluster-wide write concern and try the reconfig again.
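The reconfig in question was just the usual arbiter add (hostname illustrative):

rs.addArb("mongo-arbiter.example.net:27017")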

So I did

db.adminCommand({
    "setDefaultRWConcern": 1,
    "defaultWriteConcern": {
        "w": 1
    },
    "defaultReadConcern": {
        "level": "local"
    }
})
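To double-check that the new default actually took effect, it can be read back with:

db.adminCommand({ "getDefaultRWConcern": 1 })

which, as far as I can tell, echoes back the { "w": 1 } write concern, and the slow-query log below also reports the write concern with provenance "customDefault", so the default does seem to be applied.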

I would expect this configuration to cause no lag when reading from or writing to a PSA system with only one data-bearing node available.

But I observe "slow query" messages in the mongod log like this one:

{
    "t": {
        "$date": "2022-05-13T10:21:41.297+02:00"
    },
    "s": "I",
    "c": "COMMAND",
    "id": 51803,
    "ctx": "conn149",
    "msg": "Slow query",
    "attr": {
        "type": "command",
        "ns": "<db>.<col>",
        "command": {
            "insert": "<col>",
            "ordered": true,
            "txnNumber": 4889253,
            "$db": "<db>",
            "$clusterTime": {
                "clusterTime": {
                    "$timestamp": {
                        "t": 1652430100,
                        "i": 86
                    }
                },
                "signature": {
                    "hash": {
                        "$binary": {
                            "base64": "bEs41U6TJk/EDoSQwfzzerjx2E0=",
                            "subType": "0"
                        }
                    },
                    "keyId": 7096095617276968965
                }
            },
            "lsid": {
                "id": {
                    "$uuid": "25659dc5-a50a-4f9d-a197-73b3c9e6e556"
                }
            }
        },
        "ninserted": 1,
        "keysInserted": 3,
        "numYields": 0,
        "reslen": 230,
        "locks": {
            "ParallelBatchWriterMode": {
                "acquireCount": {
                    "r": 2
                }
            },
            "ReplicationStateTransition": {
                "acquireCount": {
                    "w": 3
                }
            },
            "Global": {
                "acquireCount": {
                    "w": 2
                }
            },
            "Database": {
                "acquireCount": {
                    "w": 2
                }
            },
            "Collection": {
                "acquireCount": {
                    "w": 2
                }
            },
            "Mutex": {
                "acquireCount": {
                    "r": 2
                }
            }
        },
        "flowControl": {
            "acquireCount": 1,
            "acquireWaitCount": 1,
            "timeAcquiringMicros": 982988
        },
        "readConcern": {
            "level": "local",
            "provenance": "implicitDefault"
        },
        "writeConcern": {
            "w": 1,
            "wtimeout": 0,
            "provenance": "customDefault"
        },
        "storage": {},
        "remote": "10.10.7.12:34258",
        "protocol": "op_msg",
        "durationMillis": 983
    }
}
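What stands out to me in this entry is the flowControl section: timeAcquiringMicros is 982988, so virtually the entire durationMillis of 983 was spent waiting for a flow control ticket. As far as I understand, flow control throttles writes on the primary when the majority commit point lags behind, which is exactly the situation in a PSA set once the secondary is down. The current flow control state can be inspected with:

db.serverStatus().flowControl

and there is also a server parameter to switch the mechanism off entirely (I am not sure this is a recommended fix, just noting that it exists):

db.adminCommand({ "setParameter": 1, "enableFlowControl": false })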

The collection involved here is under considerable load, with about 1000 reads and 1000 writes per second from different (concurrent) clients.

MongoDB 4.x with "enableMajorityReadConcern=false" performed normally here, and I did not notice any loss of performance in my application. MongoDB 5.x cannot keep up, and data is piling up in my application that I cannot write away fast enough.

So my question is whether I can get the MongoDB 4.x behaviour back. A write guarantee from the single data-bearing node that is still available in the failure scenario would be fine for me, but having to manually reconfigure the failed node in a failure scenario is something that should really be avoided.

Thanks for any advice!

In the end we changed the setup to a PSS (Primary-Secondary-Secondary) layout. This was also what was recommended in the MongoDB Community Forum.
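For completeness, the conversion itself was just replacing the arbiter with a third data-bearing member, along the lines of (hostnames illustrative):

rs.remove("arbiter.example.net:27017")
rs.add("mongo3.example.net:27017")

With three data-bearing members a single node failure no longer stalls the majority commit point, so the defaults work without any manual intervention.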
