
Key compression in Redis with Python

I have a very large data set and I'm looking into using Redis. My data set consists of a sha1 hash and n additional value(s) associated with that hash.

I use my sha1 hash as a key inside Redis, and my goal is to compress it somehow. I tried zlib followed by base64, but the new value is even longer than the original sha1 hash:

[alexus@wcmisdlin02 ~]$ python
Python 2.7.5 (default, Nov 20 2015, 02:00:19) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> hashlib.sha1('test').hexdigest()
'a94a8fe5ccb19ba61c4c0873d391e987982fbbd3'
>>> len(hashlib.sha1('test').hexdigest())
40
>>> import zlib
>>> len(zlib.compress(hashlib.sha1('test').hexdigest()))
46
>>> import base64
>>> base64.b64encode(zlib.compress(hashlib.sha1('test').hexdigest()))
'eJwFwQkBACAIA8BKIihbHB7pH8G7oAXmnaoUZlwpqwXXVsojnNiT2foB7msLYg=='
>>> len(base64.b64encode(zlib.compress(hashlib.sha1('test').hexdigest())))
64
>>> 

Any ideas how to go about it?


I'm looking into the following as well:

The output of a hash function is more or less random data, so you can't really compress it. Depending on how many objects/hashes you're dealing with and how much you care about collisions, you could use just part of the sha1 value to save space, or use an algorithm with a shorter digest size (md5, or half of a sha2-256 digest). Also, you could skip the conversion to hex and use the raw binary digest, which saves 50%. Finally, define "very large": it's very likely that the size of the key won't make a big difference.
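To make the options above concrete, here is a rough sketch comparing the hex key, the raw binary key, and a truncated key. It assumes Python 3 (the session in the question is Python 2.7, where `'test'` is already a byte string). The helper names and the 10-byte truncation length are made up for illustration, and the collision estimate is only the usual birthday-bound approximation.

```python
import hashlib
import math


def hex_key(data: bytes) -> str:
    """40-character hex digest, as used in the question."""
    return hashlib.sha1(data).hexdigest()


def raw_key(data: bytes) -> bytes:
    """20-byte raw digest: same information, half the size of the hex form."""
    return hashlib.sha1(data).digest()


def truncated_key(data: bytes, nbytes: int = 10) -> bytes:
    """First nbytes of the raw digest (hypothetical default of 10 bytes)."""
    return hashlib.sha1(data).digest()[:nbytes]


def collision_probability(n_keys: int, nbytes: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** (8 * nbytes)
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))


if __name__ == "__main__":
    print(len(hex_key(b"test")))        # length of the hex form
    print(len(raw_key(b"test")))        # length of the raw form
    print(len(truncated_key(b"test")))  # length of the truncated form
    # Rough chance of any collision among one billion 10-byte keys
    print(collision_probability(10**9, 10))
```

Redis keys are binary safe, so the raw bytes can be used as the key directly. Whether truncating is acceptable depends on how many keys you expect and how much a collision would hurt.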

