Understanding Hash Tables and Rainbow Tables

Original post: 2016-01-31 18:56:39 · Tags: hash, cryptography, hashtable, rainbowtable

So I'm trying to better understand hash tables and rainbow tables, and in my reading I feel like I'm starting to get the hang of it. There's a check-your-knowledge question that goes something like this:

"If you have a hash table storing SHA-256 passwords and you want the entire table to be stored in memory and you have 4GB of memory, how many passwords can you crack? If you use a rainbow table with 20 passwords in each chain, how many passwords can you crack? (assuming that passwords are 10 characters)"

This one totally made me question whether I knew anything about what I had been reading. So this is what I've come up with so far.

If every SHA-256 hash is always 256 bits in size, and we know that a single megabyte has 8388608 bits in it, that equals 32768 SHA-256 password hashes per meg. With 4000 megs, we take the 32768 and multiply by 4000 and come up with 131072000 passwords stored in memory.
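The arithmetic above can be checked with a quick sketch. It carries over the same assumptions as the calculation: 4 GB is treated as 4000 megs of 8388608 bits each, and only the 256-bit digests are counted, not the space for the passwords themselves. The constant names are just illustrative.

```python
# Back-of-the-envelope capacity estimate, using the question's own assumptions.
HASH_BITS = 256                  # a SHA-256 digest is always 256 bits
BITS_PER_MEG = 1024 * 1024 * 8   # 8388608 bits per megabyte, as in the question
MEGS = 4000                      # the question treats 4 GB as 4000 megs

hashes_per_meg = BITS_PER_MEG // HASH_BITS
total_hashes = hashes_per_meg * MEGS

print(hashes_per_meg)  # 32768
print(total_hashes)    # 131072000
```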

But how do I apply that to 20 chained passwords in a rainbow table? I thought a rainbow table stored hashes and the reverse of them, so that while it took up more space it could resolve a lot faster. Is there a formula or something for determining how much space I lose and thus how many passwords I lose?

Any help or knowledge is much appreciated. Thank you for your time and wisdom. :)

2 Answers

imagine a rainbow table like this:

a table is a list of chains

a chain is a password and a hash

but wait ... let's call this password P1, and the hash in the chain we call He

let's further say we have some hash function h(x) and some reduction function R(x), which maps an output of h(x) to an arbitrary but evenly distributed password in our keyspace

if you have a chain length of 20, that simply says this:

take P1 ... calculate H1 = h(P1)
calculate P2 as R(H1) ... calculate H2 as h(P2)
calculate Pn as R(Hn-1) ... calculate Hn as h(Pn)
until after 20 steps we have P20 and H20 ... which is also He

now we store P1 and He ... aka P1 and H20

this is a chain
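a minimal sketch of building one such chain, assuming SHA-256 for h and a made-up reduction R that reads the digest as a big integer and peels off base-26 digits. The lowercase-only alphabet, the R shown here, and the helper name build_chain are illustrative assumptions, not anything the question specifies. Rainbow tables in practice use a different reduction function at each step; this follows the single R described above for simplicity.

```python
import hashlib
import string
from typing import Tuple

ALPHABET = string.ascii_lowercase  # assumed keyspace: lowercase letters only
PW_LEN = 10                        # 10-character passwords, as in the question
CHAIN_LEN = 20

def h(password: str) -> bytes:
    """Hash function h(x): SHA-256 of the password."""
    return hashlib.sha256(password.encode()).digest()

def R(digest: bytes) -> str:
    """Illustrative reduction R(x): map a digest to some password in the keyspace."""
    n = int.from_bytes(digest, "big")
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)

def build_chain(p1: str, length: int = CHAIN_LEN) -> Tuple[str, bytes]:
    """Return (P1, He): the start password and the hash reached after `length` steps."""
    hn = h(p1)                   # H1 = h(P1)
    for _ in range(length - 1):
        pn = R(hn)               # Pn = R(Hn-1)
        hn = h(pn)               # Hn = h(Pn)
    return p1, hn                # only these two values are stored

start, end_hash = build_chain("aaaaaaaaaa")
print(start, end_hash.hex())
```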

a table consists of a list ... a sorted list of chains ... sorted by the hash

if you have some hash x to be cracked, do this:

assign y = x
look for y in your table
if found, take the password of the corresponding chain, rebuild all password/hash tuples that once formed the chain, and look for your password ...
if not found, assign y = h(R(y)) and start over until you either get a match or reach the chain length
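the lookup steps above can be sketched like this. It is a toy under stated assumptions: a 4-letter lowercase keyspace so it runs instantly, SHA-256 for h, an illustrative R, and a dict keyed by end hash standing in for the sorted list (it keeps a list of start passwords per end hash in case chains merge). The names build_table and lookup are hypothetical. A hit in the table can be a false alarm, so the sketch keeps walking if the rebuilt chain does not contain x.

```python
import hashlib
from typing import Dict, List, Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed keyspace
PW_LEN = 4                               # toy password length for the demo
CHAIN_LEN = 20

def h(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()

def R(digest: bytes) -> str:
    """Illustrative reduction: digest -> password in the keyspace."""
    n = int.from_bytes(digest, "big")
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)

def build_table(starts: List[str]) -> Dict[bytes, List[str]]:
    """Map each chain's end hash He to the start password(s) P1 that reach it."""
    table: Dict[bytes, List[str]] = {}
    for p1 in starts:
        y = h(p1)
        for _ in range(CHAIN_LEN - 1):
            y = h(R(y))
        table.setdefault(y, []).append(p1)
    return table

def lookup(x: bytes, table: Dict[bytes, List[str]]) -> Optional[str]:
    y = x                                   # assign y = x
    for _ in range(CHAIN_LEN):
        for p in table.get(y, []):          # look for y in the table
            for _ in range(CHAIN_LEN):      # rebuild the chain from P1
                if h(p) == x:
                    return p                # found the password for x
                p = R(h(p))
            # false alarm: x was not in this chain, keep going
        y = h(R(y))                         # not found: y = h(R(y))
    return None

table = build_table(["pass", "word", "demo"])
p = "word"
for _ in range(7):                          # pick a password from mid-chain
    p = R(h(p))
target = h(p)
found = lookup(target, table)
print(found, found == p)
```

note that the table holds only three entries here, yet any hash that appears somewhere inside those three chains can be attacked.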

so ... in terms of your initial question ...

if you use a plain dictionary to look up passwords, you need to store pairs of passwords and hashes ... one password for each hash ... one pair/tuple gives you the ability to attack one password

if you use a rainbow table, you will still be storing one password per hash you have in memory ... but the time-memory tradeoff will allow you to attack many more hashes ... in an ideal world, that would be a multiplier of your chain length ... in the real world, that depends on how good R() is ... collisions may occur, which will lead to one password/hash being present in more than one chain, introducing redundancy into your rainbow table

With rainbow tables, you only store a fraction of the hashes that you are able to crack. Hashes are organized in chains, and only the first and the last element of each chain need to be stored. So with a chain length of 20, you will be storing 2 hashes per chain and be able to crack 20 hashes. You thus gain a factor of 10.

So whatever result you had without rainbow tables (131072000), you multiply by 10 to get the number of passwords you would be able to crack if you used rainbow tables with a chain length of 20.
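As a rough sketch of that multiplication: this is the idealized figure, assuming every chain covers 20 distinct hashes (no collisions or merged chains) and that both stored endpoints cost as much as a hash. The variable names are just for illustration.

```python
CHAIN_LEN = 20                    # hashes covered by one chain
STORED_PER_CHAIN = 2              # first and last element of each chain
plain_table_hashes = 131_072_000  # result from the question, without rainbow tables

gain = CHAIN_LEN // STORED_PER_CHAIN
crackable = plain_table_hashes * gain

print(gain)       # 10
print(crackable)  # 1310720000
```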

Actually, the chains are made of hashes and passwords in alternation. So you could choose to store the beginning and the end of the chains as passwords and not hashes. Since the password space is definitely smaller than the hash space, you can store the beginning and the end of each chain as a compressed form of the password and gain some memory to be able to store even more chains.
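One possible way to compress an endpoint, sketched under an assumption the question does not state: if the 10-character passwords use only lowercase letters, each password can be stored as its index in the keyspace instead of as text or as a 32-byte hash. The encode and decode helpers are hypothetical names for this illustration.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed character set (not given in the question)
PW_LEN = 10

def encode(password: str) -> int:
    """Treat the password as a base-26 number: its index in the keyspace."""
    n = 0
    for ch in password:
        n = n * len(ALPHABET) + ALPHABET.index(ch)
    return n

def decode(n: int) -> str:
    """Inverse of encode: turn a keyspace index back into a password."""
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))

keyspace = len(ALPHABET) ** PW_LEN
bytes_per_endpoint = ((keyspace - 1).bit_length() + 7) // 8

print(bytes_per_endpoint)  # 6
print(decode(encode("rainbowtbl")))
```

Under that assumption, a chain costs 12 bytes for its two endpoints instead of 64 bytes for two SHA-256 hashes, so more chains fit in the same memory. A larger character set would need more bytes per endpoint.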