
Cassandra Quorum : Consistency Level

I have a 3-DC ring in Cassandra, with a 4-node cluster in each DC, so 4 nodes × 3 DCs = 12 nodes. I'm testing how Cassandra behaves when some nodes go down while we use the QUORUM consistency level. We have set a replication factor of 3 in each datacenter, so our quorum is:

Quorum = floor(sum of replication factors / 2) + 1. With RF = 3 in each of the 3 DCs, the sum is 9, so quorum = 5.
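
The quorum arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula as stated in the question; the function name `quorum` and the DC labels are illustrative and not part of any Cassandra API.

```python
def quorum(replication_factors):
    """Return the QUORUM size given the replication factor of each datacenter."""
    total_rf = sum(replication_factors.values())
    # floor(sum of replication factors / 2) + 1
    return total_rf // 2 + 1


# RF = 3 in each of the three datacenters described in the question.
rf_per_dc = {"DC1": 3, "DC2": 3, "DC3": 3}
print(quorum(rf_per_dc))  # 5
```

Note that the result depends only on the replication factors, not on how many nodes each DC contains.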

In theory, if I have five nodes up in my 12-node cluster, I should be good for reads and writes. So I brought down a full datacenter, DC1, and 3 nodes in another datacenter (DC2). That leaves 1 node up in DC2 and the whole of DC3 (4 nodes), so I have 5 nodes up. In theory, this should be enough for my writes to succeed at QUORUM consistency. But when I ran them, I got:

Cassandra.Unavailable Exception: Not enough replica available for query at consistency ONE (5 required but only 4 alive) .

But I do have 5 nodes alive. What am I missing here?

QUORUM by itself refers to members of the same datacenter, which in your case is DC3, with 4 nodes. But you asked for a QUORUM of 5, which DC3 cannot provide. That is why there are levels like ONE and LOCAL_ONE.

I am pretty sure you will get the same error at a QUORUM of 5, even if all the nodes in every DC are up.

You can refer to: http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html

From my point of view the operations should and will fail.

From the DC that is up you can guarantee at any time 3 replicas, RF 3.

The one node that is up in the other DC has a ~60% chance of holding another replica.

3 + 1 = 4.

You're asking for CL 5.

5 > 4 => fail.

QUORUM applies to the entire cluster, and LOCAL_QUORUM applies to a single datacenter. Some basics to understand: Cassandra is a distributed system, meaning data is distributed across your cluster, with each node owning a primary range and at the same time storing replicas of other nodes' data. This means that only the nodes responsible for storing a given piece of data are counted toward the consistency level. In your case, having 5 nodes up does not mean QUORUM is met for writes or reads: the DC with all nodes up will definitely have the data on 3 nodes (remember, your RF is 3), but the DC with only 1 node up may or may not hold the data you are querying.
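
To make the point about replica ownership concrete, here is a rough sketch that counts live replicas per partition rather than live nodes. The placement rule (a primary node plus the next two nodes on a 4-node ring, with the same primary index in every DC) is a simplifying assumption for illustration, not Cassandra's actual token allocation, and all names are hypothetical.

```python
REQUIRED = 5  # QUORUM for a total replication factor of 9


def replicas_in_dc(primary, nodes_in_dc=4, rf=3):
    """Assumed placement: the primary node plus the next rf - 1 nodes on the ring."""
    return {(primary + i) % nodes_in_dc for i in range(rf)}


def alive_replicas(primary, up_nodes_by_dc):
    """Count the live replicas of one partition across all datacenters."""
    return sum(
        len(replicas_in_dc(primary) & up_nodes)
        for up_nodes in up_nodes_by_dc.values()
    )


# Scenario from the question: DC1 fully down, one node up in DC2, DC3 fully up.
up = {"DC1": set(), "DC2": {0}, "DC3": {0, 1, 2, 3}}

for primary in range(4):
    alive = alive_replicas(primary, up)
    print(f"primary={primary} alive={alive} quorum_met={alive >= REQUIRED}")
```

Under these assumptions no partition ever has more than 4 live replicas, which matches the "5 required but only 4 alive" figure in the error message.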

In your case, if you query the DC that has all nodes up using LOCAL_QUORUM, you will get correct results.
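
As a quick check of why LOCAL_QUORUM works here, the same quorum formula can be applied to the replication factor of the local datacenter only. This is a sketch with illustrative names; it assumes the client's coordinator is in DC3, since LOCAL_QUORUM is evaluated against the coordinator's datacenter.

```python
def local_quorum(local_rf):
    """Return the LOCAL_QUORUM size for a datacenter with the given RF."""
    return local_rf // 2 + 1


local_rf = 3             # replication factor in DC3
live_local_replicas = 3  # every DC3 node is up, so all 3 local replicas are alive

required = local_quorum(local_rf)
print(required, live_local_replicas >= required)  # 2 True
```

Only 2 of the 3 replicas in DC3 need to respond, so the state of DC1 and DC2 does not affect the request.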
