MongoDB Concurrency Bottleneck

Question

Too Long; Didn't Read

This question is about a concurrency bottleneck I am seeing in MongoDB. A single query takes 1 unit of time to return; 2 concurrent queries each take 2 units; in general, n concurrent queries each take n units. What can be done to improve Mongo's response times under concurrent queries?

The Setup

I have an m3.medium instance on AWS running a MongoDB 2.6.7 server. An m3.medium has 1 vCPU (1 core of a Xeon E5-2670 v2), 3.75 GB of RAM and a 4 GB SSD.

I have a database with a single collection named user_products. A document in this collection has the following structure:

{ user: <int>, product: <int> }

There are 1000 users and 1000 products, and there's a document for every user-product pair, totaling a million documents.

The collection has an index { user: 1, product: 1 } and my results below are all indexOnly.
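For readers who want to reproduce this setup, here is a minimal sketch for the mongo shell. It assumes user and product ids both run from 0 to 999 (the benchmark script draws users from that range; the product range is a guess), and buildDocsForUser is a hypothetical helper name. The indexOnly field read at the end belongs to the 2.6-era explain() output.

```javascript
// Hypothetical helper: one document per product for a given user.
function buildDocsForUser(user, productCount) {
    var docs = [];
    for (var product = 0; product < productCount; product++) {
        docs.push({ user: user, product: product });
    }
    return docs;
}

// The database part only runs inside the mongo shell, where `db` exists.
if (typeof db !== 'undefined') {
    var coll = db.getSiblingDB('test_user_products').user_products;
    for (var user = 0; user < 1000; user++) {
        coll.insert(buildDocsForUser(user, 1000)); // batch of 1000 documents
    }
    coll.ensureIndex({ user: 1, product: 1 });

    // With _id excluded, the query can be answered from the index alone.
    var plan = coll.find({ user: 42 }, { _id: 0, user: 1, product: 1 }).explain();
    print('indexOnly: ' + plan.indexOnly);
}
```

Inserting one batch per user keeps memory use small compared with building all million documents up front.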

The Test

The test was executed from the same machine where MongoDB is running. I am using the benchRun function provided with Mongo. During the tests, no other accesses to MongoDB were being made and the tests only comprise read operations.

For each test, a number of concurrent clients is simulated, each of them making a single query as many times as possible until the test is over. Each test runs for 10 seconds. The concurrency is tested in powers of 2, from 1 to 128 simultaneous clients.

The command to run the tests:

mongo bench.js

Here's the full script (bench.js):

var
    seconds = 10,
    limit = 1000,
    USER_COUNT = 1000,
    concurrency,
    savedTime,
    res,
    timediff,
    ops,
    results,
    docsPerSecond,
    latencyRatio,
    currentLatency,
    previousLatency;

ops = [
    {
        op : "find" ,
        ns : "test_user_products.user_products" ,
        query : {
            user : { "#RAND_INT" : [ 0 , USER_COUNT - 1 ] }
        },
        limit: limit,
        fields: { _id: 0, user: 1, product: 1 }
    }
];

for (concurrency = 1; concurrency <= 128; concurrency *= 2) {

    savedTime = new Date();

    res = benchRun({
        parallel: concurrency,
        host: "localhost",
        seconds: seconds,
        ops: ops
    });

    timediff = new Date() - savedTime;

    docsPerSecond = res.query * limit;
    currentLatency = res.queryLatencyAverageMicros / 1000;

    if (previousLatency) {
        latencyRatio = currentLatency / previousLatency;
    }

    results = [
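        // Zero-pad the date and time parts (toFixed(2) would print e.g. "4.00" instead of "04").
        savedTime.getFullYear() + '-' + ('0' + (savedTime.getMonth() + 1)).slice(-2) + '-' + ('0' + savedTime.getDate()).slice(-2),
        ('0' + savedTime.getHours()).slice(-2) + ':' + ('0' + savedTime.getMinutes()).slice(-2),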
        concurrency,
        res.query,
        currentLatency,
        timediff / 1000,
        seconds,
        docsPerSecond,
        latencyRatio
    ];

    previousLatency = currentLatency;

    print(results.join('\t'));
}

Results

The results consistently look like this (some output columns are omitted for readability):

concurrency  queries/sec  avg latency (ms)  latency ratio
1            459.6        2.153609008       -
2            460.4        4.319577324       2.005738882
4            457.7        8.670418178       2.007237636
8            455.3        17.4266174        2.00989353
16           450.6        35.55693474       2.040380754
32           429          74.50149883       2.09527338
64           419.2        153.7325095       2.063482104
128          403.1        325.2151235       2.115460969

If only 1 client is active, it is capable of doing about 460 queries per second over the 10 second test. The average response time for a query is about 2 ms.

When 2 clients are concurrently sending queries, the query throughput stays at about 460 queries per second, showing that Mongo hasn't increased its response throughput. The average latency, on the other hand, doubled.

For 4 clients, the pattern continues: same query throughput, and the average latency doubles compared with 2 clients. The latency ratio column is the ratio between the current and the previous test's average latency. Note that it shows the latency roughly doubling at every step.
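One way to sanity-check these figures is Little's law, which for a closed loop of clients gives latency = clients / throughput. The sketch below applies it to the rows of the table above; it assumes each benchRun client issues its next query as soon as the previous one returns (no think time), and predictedLatencyMs is a hypothetical helper name.

```javascript
// Rows from the table above: [concurrency, queries/sec, measured avg latency (ms)]
var measured = [
    [1, 459.6, 2.153609008],
    [2, 460.4, 4.319577324],
    [4, 457.7, 8.670418178],
    [8, 455.3, 17.4266174],
    [16, 450.6, 35.55693474],
    [32, 429, 74.50149883],
    [64, 419.2, 153.7325095],
    [128, 403.1, 325.2151235]
];

// Little's law for a closed loop: latency = clients / throughput.
function predictedLatencyMs(clients, queriesPerSec) {
    return clients / queriesPerSec * 1000;
}

// `print` exists in the mongo shell; fall back to console.log elsewhere.
var log = (typeof print === 'function') ? print : console.log;
measured.forEach(function (row) {
    var predicted = predictedLatencyMs(row[0], row[1]);
    log(row[0] + ' clients: predicted ' + predicted.toFixed(2) +
        ' ms, measured ' + row[2].toFixed(2) + ' ms');
});
```

Under that assumption the predicted and measured latencies agree within a few percent, which would be consistent with a server whose total throughput is already saturated by a single client, so that every extra client only adds waiting time.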

Update: More CPU Power

I decided to test with different instance types, varying the number of vCPUs and the amount of available RAM. The purpose is to see what happens when you add more CPU power. Instance types tested:

Type        vCPUs  RAM(GB)
m3.medium   1      3.75
m3.large    2      7.5
m3.xlarge   4      15
m3.2xlarge  8      30

Here are the results:

[Chart: queries per second]

[Chart: query latency]

m3.medium

concurrency  queries/sec  avg latency (ms)  latency ratio
1            459.6        2.153609008       -
2            460.4        4.319577324       2.005738882
4            457.7        8.670418178       2.007237636
8            455.3        17.4266174        2.00989353
16           450.6        35.55693474       2.040380754
32           429          74.50149883       2.09527338
64           419.2        153.7325095       2.063482104
128          403.1        325.2151235       2.115460969

m3.large

concurrency  queries/sec  avg latency (ms)  latency ratio
1            855.5        1.15582069        -
2            947          2.093453854       1.811227185
4            961          4.13864589        1.976946318
8            958.5        8.306435055       2.007041742
16           954.8        16.72530889       2.013536347
32           936.3        34.17121062       2.043083977
64           927.9        69.09198599       2.021935563
128          896.2        143.3052382       2.074122435

m3.xlarge

concurrency  queries/sec  avg latency (ms)  latency ratio
1            807.5        1.226082735       -
2            1529.9       1.294211452       1.055566166
4            1810.5       2.191730848       1.693487447
8            1816.5       4.368602642       1.993220402
16           1805.3       8.791969257       2.01253581
32           1770         17.97939718       2.044979532
64           1759.2       36.2891598        2.018374668
128          1720.7       74.56586511       2.054769676

m3.2xlarge

concurrency  queries/sec  avg latency (ms)  latency ratio
1            836.6        1.185045183       -
2            1585.3       1.250742872       1.055438974
4            2786.4       1.422254414       1.13712774
8            3524.3       2.250554777       1.58238551
16           3536.1       4.489283844       1.994745425
32           3490.7       9.121144097       2.031759277
64           3527         18.14225682       1.989033023
128          3492.9       36.9044113        2.034168718

Starting with the xlarge type, Mongo finally handles 2 concurrent queries while keeping the query latency virtually the same (1.29 ms). It doesn't last long, though: with 4 clients the average latency rises by about 1.7x, and from 8 clients on it doubles again.

With the 2xlarge type, Mongo is able to keep handling up to 4 concurrent clients without raising the average latency too much. After that, it starts to double again.
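A quick way to read the four tables together is to divide each instance's peak throughput by its vCPU count. The sketch below is only arithmetic on the numbers above; qpsPerVcpu is a hypothetical helper name.

```javascript
// Peak queries/sec per instance type, read off the tables above.
var instances = [
    { type: 'm3.medium',  vcpus: 1, peakQps: 460.4 },
    { type: 'm3.large',   vcpus: 2, peakQps: 961 },
    { type: 'm3.xlarge',  vcpus: 4, peakQps: 1816.5 },
    { type: 'm3.2xlarge', vcpus: 8, peakQps: 3536.1 }
];

// Throughput normalized by the number of vCPUs.
function qpsPerVcpu(instance) {
    return instance.peakQps / instance.vcpus;
}

// `print` exists in the mongo shell; fall back to console.log elsewhere.
var log = (typeof print === 'function') ? print : console.log;
instances.forEach(function (instance) {
    log(instance.type + ': ' + qpsPerVcpu(instance).toFixed(1) + ' queries/sec per vCPU');
});
```

The per-vCPU figure stays between roughly 440 and 480 queries/sec, which hints that peak throughput grows about linearly with the number of cores and stops growing at roughly as many clients as there are vCPUs. This does not by itself prove the workload is CPU-bound.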

The question is: what can be done to improve Mongo's response times under concurrent queries? I expected the query throughput to rise, and I did not expect the average latency to double. It clearly shows Mongo is not able to parallelize the queries that are arriving.

There's some kind of bottleneck somewhere limiting Mongo, but simply adding more CPU power is not a solution, since the cost would be prohibitive. I don't think memory is an issue here, since my entire test database easily fits in RAM. Is there something else I could try?

Answer 1

You're using a server with 1 core and you're using benchRun. From the benchRun page:

This benchRun command is designed as a QA baseline performance measurement tool; it is not designed to be a "benchmark".

The scaling of the latency with the concurrency numbers is suspiciously exact. Are you sure the calculation is correct? I could believe that the ops/sec/runner was staying the same, with the latency/op also staying the same, as the number of runners grew - and then if you added all the latencies, you would see results like yours.

1 answers

solution1
0 2015-04-07 18:50:36
