Maintaining cache state across different servers

This may be a dumb question, but I don't even know what to google. I have a server that fetches some data from the DB and caches it; whenever a request involves this data, it is served from the cache instead of the DB, which reduces the time taken to serve the request. The cache can be modified, i.e. keys can be added, deleted, or updated, and any change made to the cache is also applied to the DB. The problem is that, due to heavy traffic, we now want to put a load balancer in front of my server. Let's say I add one more server. Then the two servers will have two different caches. If something gets added to the first server's cache, how should I tell the second server's cache to refresh?

If you ultimately decide to move the cache outside your main webserver process, then you could also take a look at consistent hashing. This would be an alternative to a replicated cache.

The problem with replicated caches is that the cost of each write grows with the number of nodes participating in the cache, i.e. their performance degrades as you add nodes. They work fine when there is a small number of nodes. If data is to be replicated between N nodes (or you need to send eviction messages to N nodes), then every write requires 1 write to the cache on the originating node and N-1 writes to the other nodes.
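To make the N-1 cost concrete, here is a minimal sketch of a replicated cache. The class `ReplicatedCache` and its `remote_messages` counter are hypothetical names made up for illustration, and the "nodes" are just in-memory dicts rather than real servers, so treat it as a toy model and not as how any particular cache library works.

```python
class ReplicatedCache:
    """Toy model of a replicated cache: every node holds a full copy."""

    def __init__(self, node_count):
        # One dict per node stands in for that node's local cache.
        self.nodes = [dict() for _ in range(node_count)]
        # Counts writes that would have to cross the network.
        self.remote_messages = 0

    def put(self, origin, key, value):
        """Write on the originating node, then replicate to all others."""
        self.nodes[origin][key] = value          # 1 local write
        for index, node in enumerate(self.nodes):
            if index != origin:
                node[key] = value                # N-1 remote writes
                self.remote_messages += 1

    def get(self, node_index, key):
        """Reads are always local, which is the upside of replication."""
        return self.nodes[node_index].get(key)


if __name__ == "__main__":
    cache = ReplicatedCache(node_count=4)
    cache.put(origin=0, key="user:42", value="alice")
    print(cache.get(3, "user:42"), cache.remote_messages)
```

Reads stay cheap because every node has the data, but each write fans out to every other node, so the messaging per write grows linearly with the cluster size.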

In consistent hashing, you instead define a hashing function that takes the key of the data you want to store or retrieve as input and returns the id of the server in the cluster which is responsible for caching the data for that key. So each caching server is responsible for a fraction of the overall keys, the client can determine which server will contain the sought data without any lookup, and data and eviction messages do not need to be replicated between caching servers.

The "consistent" part of consistent hashing refers to how your hashing function handles new servers being added to or removed from the cluster: some redistribution of keys between servers is required, but the function is designed to minimize the amount of such disruption.
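The idea described above can be sketched as a hash ring. This is only one common way to build it, under assumptions of my own: the class name `HashRing`, the use of MD5 to place points on the ring, the 100 virtual nodes per server, and the server names `web1` to `web4` are all illustrative choices, not something taken from a specific library.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a position on the ring (a 128-bit integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative)."""

    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes   # ring points per server (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> server id
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        # Each server is placed at several positions to even out the load.
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove_server(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            if self._owner.get(point) == server:
                del self._owner[point]
                self._points.remove(point)

    def server_for(self, key):
        """Return the server owning the first ring point after the key."""
        if not self._points:
            raise LookupError("no servers in the ring")
        index = bisect.bisect_right(self._points, _hash(key))
        index %= len(self._points)  # wrap around the end of the ring
        return self._owner[self._points[index]]


if __name__ == "__main__":
    ring = HashRing(["web1", "web2", "web3"])
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.server_for(k) for k in keys}
    ring.add_server("web4")
    moved = sum(1 for k in keys if ring.server_for(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after adding web4")
```

When a server is added, the only keys that change owner are those that now fall to the new server's points; everything else stays where it was. A plain `hash(key) % N` scheme, by contrast, would remap most keys whenever N changes.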

In practice, you do not actually need a dedicated caching cluster, as your caches could run in-process in your web servers; each web server can then determine which web server should store the cache data for a given key.

Consistent hashing is used at large scale. It might be overkill for you at this stage. But just be aware of the scalability bottleneck inherent in O(N) messaging architectures. A replicated cache is possibly a good idea to start with.

EDIT: Take a look at Infinispan, a distributed cache which indeed uses consistent hashing out of the box.

Any way you like ;) If you have no idea, I suggest you look at or use Ehcache or Hazelcast. They may not be the best solutions for you, but they are some of the most widely used (and CV++ ;). I suggest you understand what they do first.
