
How to set up MongoDB Sharded Cluster to Benchmark it against Cassandra?

I have a Cassandra cluster with 5 C* nodes in the ring. The replication factor is 1, so each piece of data is stored on exactly one node, and the data is partitioned across the nodes.

I followed the MongoDB manual to set up a sharded cluster, but if that requires extra nodes for the config servers and the app server(s), I don't think it is a fair setup to benchmark against the Cassandra cluster.

  • How can I set up a MongoDB sharded cluster with 5 nodes that "mimics" my Cassandra cluster configuration?
  • Can the ConfigSvr and App Servers run on the same node as the ShardSvr? How do I specify the different configs for the processes?
  • How many replica sets do I need? I think every member of a replica set holds the exact same data, so adding all nodes to one replica set won't work. Is there an easier way than introducing 5 replica sets that each contain two nodes?

Here's what I came up with:

  1. How can I set up a MongoDB sharded cluster with 5 nodes that "mimics" my Cassandra cluster configuration?

For this, you can use a sharded cluster without replication. This is also how Datastax set up MongoDB when benchmarking it against Cassandra and other databases with YCSB:

No configuration files were used for MongoDB; instead the services were started with the parameters in the command line.  For the first instance that has a configuration database:
bin/mongod --fork --logpath /var/log/mongodb/mongocfg.log --logappend --dbpath /mnt/mongodb/mongocfg --configsvr

Each service was then started:
bin/mongod --fork --logpath /var/log/mongodb/mongodb.log --logappend --dbpath /mnt/mongodb/mongodb --shardsvr --storageEngine wiredTiger
bin/mongos --fork --logpath /var/log/mongodb/mongos.log --logappend --configdb $MONGO_MASTER

Note that this setup is only meant for benchmarking Cassandra against MongoDB. Without replication it is not suited for a production environment.
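
The quoted commands date from an older MongoDB release. As far as I know, MongoDB 3.4 and later require the config server to be a replica set, and 3.6 and later require every shard to be one as well, so "no replication" on a current version means single-member replica sets. Here is a minimal dry-run sketch of that layout, which only prints the commands. The host names node1..node5, the set names cfg and shardN, and the ports are my assumptions and not part of the Datastax setup:

```shell
#!/usr/bin/env bash
# Dry run: prints the commands for a 5-shard cluster without data redundancy.
# node1..node5, the set names (cfg, shardN) and the ports are assumptions.
set -euo pipefail

CFG_HOST="node1"; CFG_PORT=27019; SHARD_PORT=27018; MONGOS_PORT=27017

config_server_cmd() {   # single-member config server replica set "cfg"
  echo "mongod --fork --logpath /var/log/mongodb/mongocfg.log --logappend" \
       "--dbpath /mnt/mongodb/mongocfg --configsvr --replSet cfg" \
       "--port ${CFG_PORT} --bind_ip_all"
}

shard_cmd() {           # $1 = shard index; single-member set "shard<N>"
  echo "mongod --fork --logpath /var/log/mongodb/mongodb.log --logappend" \
       "--dbpath /mnt/mongodb/mongodb --shardsvr --replSet shard$1" \
       "--port ${SHARD_PORT} --bind_ip_all"
}

mongos_cmd() {          # query router pointing at the config server set
  echo "mongos --fork --logpath /var/log/mongodb/mongos.log --logappend" \
       "--configdb cfg/${CFG_HOST}:${CFG_PORT} --port ${MONGOS_PORT} --bind_ip_all"
}

initiate_cmd() {        # $1 = port of the mongod whose set should be initiated
  echo "mongosh --port $1 --eval 'rs.initiate()'"
}

add_shard_cmd() {       # $1 = shard index; run once against any mongos
  echo "mongosh --port ${MONGOS_PORT} --eval 'sh.addShard(\"shard$1/node$1:${SHARD_PORT}\")'"
}

for i in 1 2 3 4 5; do
  echo "# on node${i}"
  if [ "$i" -eq 1 ]; then config_server_cmd; initiate_cmd "$CFG_PORT"; fi
  shard_cmd "$i"; initiate_cmd "$SHARD_PORT"
  mongos_cmd
done
echo "# once, from any node"
for i in 1 2 3 4 5; do add_shard_cmd "$i"; done
```

When running the printed commands for real, initiate the config server set before starting any mongos, and register the shards only after every shard set has been initiated.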

  2. Can the ConfigSvr and App Servers run on the same node as the ShardSvr? How do I specify the different configs for the processes?

Yes, they can. Give each process its own config file or pass its options on the command line. Either way, every process on a node needs its own port and log file, and every mongod its own data directory. The MongoDB documentation describes both approaches.
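
As a rough sketch of the config-file variant, the script below writes one YAML file per process so that a config server, a shard server and a mongos can share a node. The output directory, the names cfg and shard1, the host node1 and the ports are assumptions for illustration. The option names are the ones I know from the MongoDB configuration file format:

```shell
#!/usr/bin/env bash
# Writes one config file per process for a single node.
# CONF_DIR, the set names, node1 and the ports are assumptions.
set -euo pipefail

CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
mkdir -p "$CONF_DIR"

# write_mongod_conf NAME ROLE REPLSET PORT
write_mongod_conf() {
  cat > "$CONF_DIR/$1.conf" <<EOF
systemLog:
  destination: file
  path: /var/log/mongodb/$1.log
  logAppend: true
storage:
  dbPath: /mnt/mongodb/$1
net:
  port: $4
  bindIpAll: true
processManagement:
  fork: true
sharding:
  clusterRole: $2
replication:
  replSetName: $3
EOF
}

write_mongod_conf mongocfg configsvr cfg    27019
write_mongod_conf mongodb  shardsvr  shard1 27018

# mongos stores no data, so it has no storage section; it only needs
# to know where the config server replica set lives.
cat > "$CONF_DIR/mongos.conf" <<EOF
systemLog:
  destination: file
  path: /var/log/mongodb/mongos.log
  logAppend: true
net:
  port: 27017
  bindIpAll: true
processManagement:
  fork: true
sharding:
  configDB: cfg/node1:27019
EOF

# Commands that would start the three processes with these files
echo "mongod --config $CONF_DIR/mongocfg.conf"
echo "mongod --config $CONF_DIR/mongodb.conf"
echo "mongos --config $CONF_DIR/mongos.conf"
```

Separate files keep the three processes from sharing a port, data directory or log file, which is the main thing to watch when co-locating them.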

  3. How many replica sets do I need? I think every member of a replica set holds the exact same data, so adding all nodes to one replica set won't work. Is there an easier way than introducing 5 replica sets that each contain two nodes?

When benchmarking the database for your use case, it may be acceptable to run without replication for the duration of the benchmark. That is what Datastax did as well; see the benchmark referenced in answer 1.

If you want replication and have 5 nodes, you can build one 3-node replica set and one 2-node replica set with an additional arbiter.

With 6 nodes, you can use three 2-node replica sets, each with an arbiter. This comes closest to the replication factor of 1 in Cassandra.
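
For the 6-node variant, a sketch of the replica set definitions might look like the following. It only prints the `rs.initiate(...)` documents. The host names, the port 27018 for data members, the port 27020 for arbiters, and the choice to place each arbiter on a node of another set are all assumptions:

```shell
#!/usr/bin/env bash
# Prints rs.initiate() documents for three 2-node replica sets plus arbiters.
# Host names, ports and the arbiter placement are assumptions.
set -euo pipefail

# rs_initiate_js SET DATA_HOST_1 DATA_HOST_2 ARBITER_HOST
rs_initiate_js() {
  printf 'rs.initiate({_id: "%s", members: [' "$1"
  printf '{_id: 0, host: "%s"}, ' "$2"
  printf '{_id: 1, host: "%s"}, ' "$3"
  printf '{_id: 2, host: "%s", arbiterOnly: true}]})\n' "$4"
}

# Each arbiter sits on a node that belongs to a different set, so the
# loss of one node never removes two voting members of the same set.
rs_initiate_js shard1 node1:27018 node2:27018 node3:27020
rs_initiate_js shard2 node3:27018 node4:27018 node5:27020
rs_initiate_js shard3 node5:27018 node6:27018 node1:27020
```

Each printed document would be passed to `mongosh --eval` on one data member of the respective set, after all three of its mongod processes are running.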
