predis SCAN slower than KEYS

I need a method to get all keys by prefix so that I can delete them.

I've read that KEYS is not suitable for production, so I ran a few tests to check performance. I'm using predis 1.1.6 (PHP), and I tested both on my local machine and in a testing AWS environment with ElastiCache Redis. I'm doing this on a node with about 300k items.

I'm using prefixes of the form CLIENT/ID_CLIENT/MODULE:HASH, which translates to client/9999/products:452a269b82c199ef27f5a299e3b0f98531216ccf

So I need to search for and delete all keys belonging to a client and module. Since I use prefixes, I've set the correct prefix and used the predis keys method:

$this->_redisPrefix('client/9999/products:');
$keys = $this->_redis_client->keys('*');

This is extremely fast; it takes about 50 ms.

Since KEYS is not recommended in production, I tried to achieve the same thing with SCAN. predis does not have a scan method, so I needed to do this:

foreach (new Iterator\Keyspace($this->_redis_client, 'client/9999/products:*') as $key) {
    $keys[] = $key;
}

This returns exactly the same results, but it took 20 seconds(!). I thought this was something related to my local machine, but I've deployed it to our AWS environment and the response times were the same. I did not use pagination because I need all the items to be deleted and I don't know how many there are. It can be 10 or it can be 1000 (or more).

I want to avoid KEYS, but I cannot use SCAN with these kinds of timings.

Using KEYS in production

First, it's important to understand why KEYS shouldn't be used in production.

KEYS has a time complexity of O(N), where N is the number of keys in the entire database, NOT the number of keys that match the pattern. Since only one command can run at a time (Redis executes commands on a single thread), everything else has to wait for that KEYS call to complete.
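To make the O(N) point concrete, here is a toy model in plain PHP. It is not how Redis is implemented; `simulateKeys` is a hypothetical helper, and `fnmatch` only stands in for Redis's glob matching. It shows that every key is examined regardless of how few match:

```php
<?php
// Toy model of KEYS: every key in the database is tested against the pattern,
// so the work grows with the database size, not with the number of matches.
function simulateKeys(array $allKeys, string $pattern): array
{
    $examined = 0;
    $matches = [];
    foreach ($allKeys as $key) {
        $examined++;
        if (fnmatch($pattern, $key)) {
            $matches[] = $key;
        }
    }
    return ['examined' => $examined, 'matches' => $matches];
}

$db = [
    'client/9999/products:a',
    'client/9999/orders:b',
    'client/1/products:c',
];
$run = simulateKeys($db, 'client/9999/products:*');
echo "examined {$run['examined']} keys, matched " . count($run['matches']) . "\n";
// prints "examined 3 keys, matched 1"
```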

See: Why KEYS is advised not to be used in Redis?

According to the docs:

While the time complexity for this operation is O(N), the constant times are fairly low. For example, Redis running on an entry level laptop can scan a 1 million key database in 40 milliseconds.

Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases. This command is intended for debugging and special operations, such as changing your keyspace layout. Don't use KEYS in your regular application code. If you're looking for a way to find keys in a subset of your keyspace, consider using SCAN or sets.

This would indicate that if you have fewer than a million records, using KEYS should be okay-ish. But as your database grows, or as you get more concurrent users, issues may arise.

Alternatives to KEYS

SCAN

A common alternative to KEYS is SCAN (which is what you are using). Note that this is still a bad alternative, as it is very similar to KEYS, except that the result is returned in chunks. A full iteration is still O(N), where N is the number of elements in the entire database.

The advantage is that it doesn't block the server, but it has the same time complexity as KEYS. In fact, if all you want is the result, and you don't care about blocking the database, it can be slower than KEYS because it has to perform multiple queries (as you have seen).
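How many queries is "multiple"? SCAN's COUNT hint defaults to 10, so a full pass over a large keyspace needs a lot of round trips, which may account for much of the gap you measured. A rough back-of-the-envelope sketch, where `estimateScanCalls` is a hypothetical helper and the assumption is that each call examines about COUNT keys (COUNT is only a hint, so real numbers will differ):

```php
<?php
// Rough estimate of how many SCAN calls a full keyspace iteration needs.
// Assumption: each call examines about $count keys (COUNT is only a hint).
function estimateScanCalls(int $dbSize, int $count = 10): int
{
    return (int) ceil($dbSize / max(1, $count));
}

echo estimateScanCalls(300000), "\n";       // default COUNT of 10 -> 30000
echo estimateScanCalls(300000, 1000), "\n"; // larger COUNT hint   -> 300
```

With predis, the hint can be passed as the third constructor argument of the iterator, for example new Iterator\Keyspace($this->_redis_client, 'client/9999/products:*', 1000). The value 1000 is an arbitrary illustration: larger values mean fewer round trips but more work per call on the server.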

HSET

A much better alternative is to store the items in a hash, using HSET.

When you want to put elements into the hash, use:

HSET client/9999/products "id_547" "Book"
HSET client/9999/products "whatever_key_you_want" "Laptop"

$this->_redis_client->hset('client/9999/products', 'id_547', 'Book');
$this->_redis_client->hset('client/9999/products', 'whatever_key_you_want', 'Laptop');

And when you want to get all the keys, just use HKEYS:

HKEYS client/9999/products
1) id_547
2) whatever_key_you_want

$this->_redis_client->hkeys('client/9999/products');

Unlike KEYS, the complexity of HKEYS is O(N) where N is the size of the hash (NOT the size of the entire database).

If the hash gets very large, you may want to use HSCAN.
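If you do iterate a large hash incrementally, it usually makes sense to handle the fields in batches rather than one at a time. A minimal sketch of the batching step only, in plain PHP: `inBatches` is a hypothetical helper, and the array below stands in for field names that would really come from an HSCAN loop or from HKEYS:

```php
<?php
// Group any iterable into arrays of at most $size items (original keys are discarded).
function inBatches(iterable $items, int $size): Generator
{
    $size = max(1, $size);
    $batch = [];
    foreach ($items as $item) {
        $batch[] = $item;
        if (count($batch) >= $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch; // final, possibly smaller, batch
    }
}

// Stand-in for the field names of a large hash.
$fields = ['id_1', 'id_2', 'id_3', 'id_4', 'id_5'];
foreach (inBatches($fields, 2) as $batch) {
    echo implode(',', $batch), "\n";
}
```

Each batch could then be handed to a single HDEL call. And if the whole client/module has to go, deleting the hash key itself (DEL client/9999/products) removes every field in one command, so no iteration is needed at all.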

Performance test

In a redis database with around 2,000,000 items:

for ($i = 0; $i <= 100; $i++) {
    $client->set("a:{$i}", "value{$i}");
}
for ($i = 0; $i <= 100; $i++) {
    $client->hset("b", $i, "value{$i}");
}

Test 1: KEYS

$start = microtime(true);
var_dump(count($client->keys('a:*')));
$end = microtime(TRUE);
echo ($end - $start) . "s\n";

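Test 2: SCAN

// At the top of the file: import the predis SCAN-based iterator.
use Predis\Collection\Iterator\Keyspace;

$start = microtime(true);
$count = 0;
foreach (new Keyspace($client, 'a:*') as $key) {
    $count++;
}
$end = microtime(true);
echo ($end - $start) . "s\n";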

Test 3: HKEYS

$start = microtime(true);
var_dump(count($client->hkeys('b')));
$end = microtime(TRUE);
echo ($end - $start) . "s\n";
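
The three tests repeat the same microtime pattern. If you want to rerun them, a small wrapper keeps the measurements consistent. This is only a sketch: `timeIt` is a hypothetical helper, and the workload below is a placeholder rather than a Redis call:

```php
<?php
// Run a callable once and return its result together with the elapsed seconds.
function timeIt(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    $elapsed = microtime(true) - $start;
    return ['result' => $result, 'seconds' => $elapsed];
}

$run = timeIt(function () {
    return count(range(1, 1000)); // placeholder workload
});
printf("%d items in %.4fs\n", $run['result'], $run['seconds']);
```

In the tests above, the closure body would be the call being measured, such as count($client->keys('a:*')).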

Results

  • KEYS: ~0.21s
  • SCAN: ~20s
  • HKEYS: ~0.01s

As you can see, HKEYS is much faster, and is unaffected by the size of the database.

I also recommend using the redis PECL extension instead of predis.

With the Redis extension I got:

  • KEYS: ~0.21s (not much change)
  • SCAN: ~17s (slightly faster)
  • HKEYS: ~0.0004s (much faster!)
