简体   繁体   中英

The replication factor in Cassandra when creating a keyspace

When creating a new namespace in Cassandra, we need to give a number for a replication factor. Ex: 在此处输入图片说明

Does the number, that we are giving as the replication factor, determine the number of nodes that initially create to store the replicate data? Can anybody give a clear clarification about what that replication factor does?

It will not create the number of nodes specified. It just means the number of copies of data. For instance if your cluster is having 5 nodes, your write will be replicated(written) to 3 different nodes depending on the token range it falls. Coming to SimpleStrategy its asn implementation where it does not consider rack or dc's into consideration when replicating.

The explanation @Praneeth Gudumasu given for replication_factor is true. The number of nodes in a Cassandra cluster is not something you "give", you can actually connect as many number of nodes as you wish: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html

and each time you connect a new node it is assigned a token range as per Cassandra's architecture. If you don't know how many nodes you need for your application I suggest running a performance test with data size approaching the size you would be inserting in your real application, then try to execute some queries (concurrently) and see with how many nodes you would get a reasonable response time for your queries.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM