Memcached for PHP and failover

We are deploying memcached for our application, and I would like to make it as resilient as I can.

We plan to use the newer memcacheD extension.

One thing that I have not fully figured out is what happens if one of the servers dies. At the very least, it seems like the memcached client just 'gives up' on that server and doesn't store anything in it.

This behavior I'm fine with; we can deal with a bunch of cache misses. However, it would be nice if, after one of the servers is deemed 'failed', subsequent sets and gets were redistributed to the remaining servers.

Since this does not seem to happen automatically, I guess the only way to approach this issue is to have an external system do health checks on the memcached servers and update the list of servers appropriately.
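For what it's worth, the health check itself can be trivial. A minimal liveness probe over memcached's text protocol might look something like this (host and port are placeholders):

<?php
// Minimal memcached liveness probe: the daemon answers the text-protocol
// "version" command with a line starting with "VERSION".
function memcached_alive(string $host, int $port, float $timeout = 1.0): bool
{
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return false;
    }
    stream_set_timeout($fp, (int) ceil($timeout));
    fwrite($fp, "version\r\n");
    $line = fgets($fp);
    fclose($fp);
    return $line !== false && strncmp($line, 'VERSION', 7) === 0;
}

var_dump(memcached_alive('127.0.0.1', 11211)); // placeholder host/port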

But if there's a list of 10 servers and, let's say, the 5th dies, then even with Ketama hashing it would seem that this would trigger a big redistribution of keys (this is just based on common sense).

So ideally, I would simply like the PHP extension to figure out that a server is down, mark it as down for a specified amount of time (say, 10 minutes), and during those 10 minutes fall back to the other servers (nicely distributed) for sets and gets.

How do other people solve this?

Edit: clarifying my libketama point.

Say we have 10 servers:

1,2,3,4,5,6,7,8,9,10

One of them dies. Libketama then provides a very high likelihood that the hits to the missing server get equally distributed among the remaining servers:

1,2,3,4,inactive,6,7,8,9,10

BUT: if we provide and manage this list manually, this is not the case:

1,2,3,4,6,7,8,9,10 // There are now 9 servers!

6 will now get 5's previous keys, 7 will get 6's, 8 will get 7's, 9 will get 8's, and 10 will get 9's. All the hits the 10th server used to get will not be evenly distributed among the remaining ones, resulting in a high likelihood of almost 50% of all keys being sent to new servers.
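To illustrate the worst case: if key placement really did depend on list position (as in a plain "hash modulo server count" scheme), shrinking the list from 10 to 9 servers would remap the large majority of keys. A quick simulation of that naive scheme (illustrative only, not how libketama itself assigns keys):

<?php
// Count how many of 10,000 synthetic keys change servers when a naive
// "hash % server_count" scheme goes from 10 servers to 9.
$keys  = 10000;
$moved = 0;
for ($i = 0; $i < $keys; $i++) {
    $hash = crc32("key-$i");
    if ($hash % 10 !== $hash % 9) {
        $moved++;
    }
}
// Typically prints ~90%; a consistent-hashing continuum would move only ~10%.
printf("%.1f%% of keys remapped\n", 100 * $moved / $keys);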

I generally store my list of available servers in APC, so I can modify it on the fly. You're correct that systems will attempt to continue using the down server while it's listed; luckily, with the new hashing methods it's not a big deal to pull it from rotation.

I would avoid using a brand-new PHP extension or trying to add new software to your deployment stack. You're likely already using something for monitoring (Nagios?). Having it invoke a simple PHP script on each of your web servers to tweak the in-memory list seems like the best bet.
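A rough sketch of that pattern, using APCu as the shared in-memory list (the cache key and function names here are illustrative):

<?php
// Server list kept in APCu so a monitoring hook can edit it on the fly
// without a redeploy.
function active_servers(): array
{
    $servers = apcu_fetch('memcached_servers');
    if ($servers === false) {
        // Seed a default list when the entry is missing (placeholder hosts).
        $servers = [['10.0.0.1', 11211], ['10.0.0.2', 11211]];
        apcu_store('memcached_servers', $servers);
    }
    return $servers;
}

// Called by the monitoring system (e.g. a Nagios event handler invoking a
// script on each web server) when a host fails; a mirror function re-adds it.
function mark_server_down(string $host): void
{
    $remaining = array_values(array_filter(
        active_servers(),
        fn (array $s) => $s[0] !== $host
    ));
    apcu_store('memcached_servers', $remaining);
}

$m = new Memcached();
$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);
$m->addServers(active_servers());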

It's worth noting that under the Ketama hashing system, removing a server from rotation results in its keys being re-hashed elsewhere on the ring (continuum); the other servers will not see their keys reassigned. Visualize it as a circle: each server is assigned multiple points on the circle (100-200). Keys are hashed to the circle and continue clockwise until they find a server. Removing a server from the ring only results in those values continuing a bit further clockwise to find a new server. With luck, the distribution of values will hit the remaining servers equally.

Demonstrating the hashing system:

<?php


$m = new Memcached();
$m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);


$m->addServer('localhost', 11211);
$m->addServer('localhost', 11212);
$m->addServer('localhost', 11213);
$m->addServer('localhost', 11214);
$m->addServer('localhost', 11215);
$m->addServer('localhost', 11216);
$m->addServer('localhost', 11217);
$m->addServer('localhost', 11218);
$m->addServer('localhost', 11219);
$m->addServer('localhost', 11210);

$key = uniqid(); // You could use md5(uniqid()) for greater key variation, but I don't think it's necessary.
$m->set($key, $key, 5);


var_dump($m->get($key));

unset($m);


$m = new Memcached();
$m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
//One server removed. If assignment to the continuum depended on add order, we would expect this get to fail 90% of the time, succeeding only when the value happened to be stored on the first server. If assignment is based on a hash of the server details, we would expect it to succeed 90% of the time.
$m->addServer('localhost', 11211);
//$m->addServer('localhost', 11212);
$m->addServer('localhost', 11213);
$m->addServer('localhost', 11214);
$m->addServer('localhost', 11215);
$m->addServer('localhost', 11216);
$m->addServer('localhost', 11217);
$m->addServer('localhost', 11218);
$m->addServer('localhost', 11219);
$m->addServer('localhost', 11210);

var_dump($m->get($key));

unset($m);

$m = new Memcached();
$m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
//2 servers removed
$m->addServer('localhost', 11211);
$m->addServer('localhost', 11212);
//$m->addServer('localhost', 11213);
//$m->addServer('localhost', 11214);
$m->addServer('localhost', 11215);
$m->addServer('localhost', 11216);
$m->addServer('localhost', 11217);
$m->addServer('localhost', 11218);
$m->addServer('localhost', 11219);
$m->addServer('localhost', 11210);

var_dump($m->get($key));

unset($m);

$m = new Memcached();
$m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
//Same 10 servers, added out of order
$m->addServer('localhost', 11210);
$m->addServer('localhost', 11211);
$m->addServer('localhost', 11219);
$m->addServer('localhost', 11212);
$m->addServer('localhost', 11217);
$m->addServer('localhost', 11214);
$m->addServer('localhost', 11215);
$m->addServer('localhost', 11216);
$m->addServer('localhost', 11218);
$m->addServer('localhost', 11213);

var_dump($m->get($key));

unset($m);

If the hashing system cared about order or omitted servers, we would expect to get bool(false) in most of the later examples, since an early server was removed. However, based on my quick, completely non-scientific tests, I only get bool(false) in any particular slot one time in ten. To be clear, I just launched 10 memcached servers on my test box, giving each of them only 4 MB of RAM.

You might want to try out the Memcached::OPT_AUTO_EJECT_HOSTS option constant for PHP. It's not directly documented, but there is a comment here naming it.

(I haven't tried it, so I can't tell you whether it'll work or not)
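If you do experiment with it, it presumably needs to be combined with a failure limit and a retry timeout to get the "mark down for N minutes" behavior the question asks for. An untested sketch:

<?php
// Untested sketch: eject a server from the pool after repeated failures
// and retry it later. Exact semantics depend on your libmemcached version.
$m = new Memcached();
$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true); // consistent hashing
$m->setOption(Memcached::OPT_SERVER_FAILURE_LIMIT, 3);    // failures before eject
$m->setOption(Memcached::OPT_AUTO_EJECT_HOSTS, true);     // drop the dead host
$m->setOption(Memcached::OPT_RETRY_TIMEOUT, 600);         // retry after 10 minutes
$m->addServer('10.0.0.1', 11211); // placeholder hosts
$m->addServer('10.0.0.2', 11211);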

Based on the answers to the comments, I would suggest something along these lines:

You'll need to build a caching class.

This class will contain the following information:

  • List of cache servers
    • Status (online or offline)
    • Count of requests to each server
  • List of keys currently stored and which server each is on

Next you'll need your standard functions to add, update, and delete keys.

Each time you execute one of these functions, you'll want to check whether the key is already in the cache and which server it is on.

If it is not on any server, pick the server with the lowest request count to store it on after retrieving the actual DB value.

If any of these functions returns an error from the cache server, I would mark that server as offline, reset its count, and remove from the list any keys that are on that server.

At this point, you could easily move those keys to a new server automatically, or just delete them so they will be queried again.
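A minimal sketch of that design (all names are illustrative, and sharing the key map across web nodes is deliberately left out):

<?php
// Illustrative wrapper class: tracks server status, request counts,
// and which server each key was stored on. PHP 8+.
class CachePool
{
    /** @var array<string, array{conn: Memcached, online: bool, requests: int}> */
    private array $servers = [];

    /** @var array<string, string> key => "host:port" */
    private array $keyMap = [];

    public function addServer(string $host, int $port): void
    {
        $conn = new Memcached();
        $conn->addServer($host, $port);
        $this->servers["$host:$port"] = ['conn' => $conn, 'online' => true, 'requests' => 0];
    }

    public function set(string $key, mixed $value, int $ttl = 0): bool
    {
        $id = $this->keyMap[$key] ?? null;
        if ($id === null || !$this->servers[$id]['online']) {
            $id = $this->leastLoadedServer(); // lowest request count
        }
        if ($id === null) {
            return false; // every server is offline
        }
        $this->servers[$id]['requests']++;
        if (!$this->servers[$id]['conn']->set($key, $value, $ttl)) {
            $this->markOffline($id);
            return false;
        }
        $this->keyMap[$key] = $id;
        return true;
    }

    public function get(string $key): mixed
    {
        $id = $this->keyMap[$key] ?? null;
        if ($id === null || !$this->servers[$id]['online']) {
            return false; // treat as a miss; the caller re-queries the DB
        }
        $this->servers[$id]['requests']++;
        $value = $this->servers[$id]['conn']->get($key);
        // A miss is normal (expiry); any other failure means the server is gone.
        if ($value === false
            && $this->servers[$id]['conn']->getResultCode() !== Memcached::RES_NOTFOUND) {
            $this->markOffline($id);
        }
        return $value;
    }

    private function leastLoadedServer(): ?string
    {
        $best = null;
        foreach ($this->servers as $id => $s) {
            if ($s['online']
                && ($best === null || $s['requests'] < $this->servers[$best]['requests'])) {
                $best = $id;
            }
        }
        return $best;
    }

    private function markOffline(string $id): void
    {
        $this->servers[$id]['online'] = false;
        $this->servers[$id]['requests'] = 0; // reset the count, as described above
        // Forget keys stored on the failed server so they get re-queried.
        $this->keyMap = array_filter($this->keyMap, fn (string $sid) => $sid !== $id);
    }
}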

My 2 cents: it is NOT going to be easy to develop a robust HA module for memcached. For example, think about the following cases:

  • How will you determine which server is alive and which one has died? You would need to somehow sync this state between the HA modules running on all your web/app servers.
  • How do you publish this information between your web/app servers?
  • Are you going to have an orchestrator?

I suggest that you have a look at Redis Sentinel, which is now in beta and has been developed and tested over the last few months specifically to solve these problems for Redis. You will find a lot of corner cases there that you must be aware of before writing a single line of code.

As for the other issues that were discussed here:

  • When you lose a node, you lose 1/N of your keys, where N is the number of nodes you initially had, i.e. including the failed node (for example, with 10 nodes you lose roughly 10% of your keys). This is how Ketama works.
  • Storing keys on the Memcached client using a new Memcached class is definitely not the way to go (IMO): (1) where are you going to save all these keys? (2) How do you sync them between your web/app nodes? (3) How long will it take to access these data structures in order to figure out which node each key resides on? This is why memcached is based entirely on a hash function: to keep lookups fast and simple.

Last but not least, I would suggest that you also check out Memcached-as-a-service solutions. For instance, we at Garantia Data have already solved the HA issues of Memcached.

Disclosure: I'm Co-Founder and CTO of Garantia Data.
