
Consistency level tuning in Cassandra

Imagine an e-commerce application:

Let's say I have a three-node cluster: N1, N2, N3. My consistency level (CL) is weak, that is:

Read CL = N/2+1 = 2 (in this case), Write CL = ANY (at least 1)

I have a product table. This is the initial data, which is in sync across all three nodes:

 product_info : { 'computer': 1}
  1. Now Client A reads the info from N1 and Client B reads the info from N2.

    Client A sees 1 computer is available

    Client B sees 1 computer is available

  2. Both of them now go to buy it. Client A places the order first, so at N1 the table will look like the following:

    product_info : {'computer':0}

  3. Now Client B places the order, so at N2 the table will look like the following:

    product_info : {'computer':0}

    But in reality Client B's order should not have been processed.

  4. Client C accesses the data through N3. A read is now done at N1, which returns 0 (with quorum, at least 2 nodes must respond). N3 has a value of 1, but its timestamp is outdated, so it will update its value and show the client that no computers are available. This is good.

    In this example, both weak and strong consistency levels will lead to wrong results, simply because at the time product_info is first loaded by Clients A and B, the data is in sync. How can this be handled in Cassandra?

You haven't mentioned your replication factor.

If your read consistency + write consistency > replication factor, you WILL get immediate consistency.

Let's say your replication factor is 3. For immediate consistency with RC = 2, you will need a WC of at least 2. If you want immediate consistency with WC = 1, your RC will need to be 3. Note that this would seriously impact availability, as a single node going down would mean you can't read.
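The rule above is simple arithmetic, so it can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and consistency levels are expressed as the number of replicas that must respond rather than as Cassandra's named levels.

```python
def is_immediately_consistent(read_cl: int, write_cl: int, replication_factor: int) -> bool:
    """True when every set of read replicas must overlap every set of write replicas."""
    return read_cl + write_cl > replication_factor


def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def consistent_pairs(replication_factor: int) -> list[tuple[int, int]]:
    """All (read_cl, write_cl) replica counts that satisfy R + W > RF."""
    levels = range(1, replication_factor + 1)
    return [
        (r, w)
        for r in levels
        for w in levels
        if is_immediately_consistent(r, w, replication_factor)
    ]
```

With a replication factor of 3, a quorum read (2) combined with a write acknowledged by a single replica gives 2 + 1 = 3, which is not greater than 3, so the setup described in the question does not give immediate consistency.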

Immediate consistency means that you will read whatever has been written, i.e. after a successful write, no read will return the previous value. However, this does NOT prevent your application from acting on a value it read earlier.

You can use lightweight transactions in this case: UPDATE ..... IF [some condition]. This will be slower, but may be enough for your use case.
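To see why a conditional update helps, here is a small in-memory simulation in Python. It is not Cassandra and does not use any driver API: the Inventory class, its update_if method and the buy helper are hypothetical names, and a local lock stands in for the coordination that a lightweight transaction performs across replicas. It only illustrates the compare-and-set semantics of "apply the write only if the value I read is still the current one".

```python
import threading


class Inventory:
    """In-memory stand-in for a table row; NOT Cassandra, only the semantics."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)
        # Stands in for the coordination a lightweight transaction performs.
        self._lock = threading.Lock()

    def read(self, item: str) -> int:
        return self._stock[item]

    def blind_write(self, item: str, new_value: int) -> None:
        """Unconditional write: last writer wins, stale reads go unnoticed."""
        with self._lock:
            self._stock[item] = new_value

    def update_if(self, item: str, expected: int, new_value: int) -> bool:
        """Conditional write: applied only if the stored value still equals `expected`."""
        with self._lock:
            if self._stock[item] != expected:
                return False
            self._stock[item] = new_value
            return True


def buy(inventory: Inventory, item: str, seen: int) -> bool:
    """Try to buy one unit, based on a quantity `seen` in an earlier read."""
    if seen < 1:
        return False
    return inventory.update_if(item, expected=seen, new_value=seen - 1)


# Both clients read 1 before either of them orders, as in the question.
inventory = Inventory({"computer": 1})
seen_by_a = inventory.read("computer")
seen_by_b = inventory.read("computer")
a_ok = buy(inventory, "computer", seen_by_a)  # applied: 1 -> 0
b_ok = buy(inventory, "computer", seen_by_b)  # rejected: value is no longer 1
```

The second order is rejected because the condition no longer holds, so the application can re-read the quantity and tell Client B the item is sold out. With blind_write instead, both orders would have been accepted.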

Quite often, especially in distributed scenarios, it is better to deal with failure - even make it a business case - instead of trying to prevent anything "bad" from ever happening. Edge cases like this are opportunities to talk with the business, and find hidden opportunities:

  • What happens if we overbook an item?
  • Is it better to cancel an order, or to let the customer know that their order has been unavoidably delayed, possibly still making the sale and giving them a gift voucher?
  • Can we give the customer a slightly better computer taking a slight hit on profit? This can help us make the sale, satisfy the customer and possibly give us return business. Dell often does this.
  • Can we call up the customer and explain the scenario, potentially upselling?

We can even accept the order and let one customer know when we find that there's an issue - I've personally seen this with Amazon.

If we absolutely must prevent any overselling at sell time, then there are patterns for that as well. We can use a distributed lock, using something like Raft or even ZooKeeper, to handle coordination outside of Cassandra. We can also implement logical locks with TTLs for each item - the TTLs ensure that misbehaving code cannot hold on to inventory indefinitely.

It really depends on how strong a guarantee you want, and how much trouble you're willing to go through to achieve it. And, more importantly, on whether it would be more profitable not to solve it at all.
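As a rough illustration of the logical-lock idea, here is a single-process Python sketch. The TtlLockTable class and its methods are hypothetical, and the in-memory dict is only a stand-in: in Cassandra the same idea would presumably be a row written with a conditional insert (IF NOT EXISTS) and a TTL, which this sketch does not attempt to reproduce. The clock is injectable so that expiry can be demonstrated without sleeping.

```python
import time
from typing import Callable


class TtlLockTable:
    """Per-item logical locks that expire on their own after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # item -> (owner, expiry time according to the injected clock)
        self._locks: dict[str, tuple[str, float]] = {}

    def acquire(self, item: str, owner: str, ttl_seconds: float) -> bool:
        """Take the lock unless an unexpired lock exists (not re-entrant)."""
        now = self._clock()
        held = self._locks.get(item)
        if held is not None and held[1] > now:
            return False
        self._locks[item] = (owner, now + ttl_seconds)
        return True

    def release(self, item: str, owner: str) -> bool:
        """Release only a lock that this owner still holds and that has not expired."""
        held = self._locks.get(item)
        if held is None or held[0] != owner or held[1] <= self._clock():
            return False
        del self._locks[item]
        return True
```

If the client holding the lock crashes or forgets to release it, the lock simply expires after ttl_seconds and another client can take it, which is the safety net the TTL provides.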

Hope that helps.
