

What does it really mean to resize a hash table that uses the separate chaining resolution technique?

I am learning about hash tables that use separate chaining with linked lists, and I have reached the point where the hash table has to be resized. But I don't understand why we need to resize it, since it works the same with or without resizing. I read somewhere that it has something to do with the time complexity of the search, insert, and remove operations. But how does resizing the table affect these? Please answer in as simple language as possible, as I am new to these things, and sorry for any English mistakes, as English is not my primary language.

In short: smaller hash tables have a higher collision rate, which means more work is needed on each operation to find the correct item.

Consider a simple example of a hash table as an array of lists. If two items map to the same array position (slot), you add the new item to the list at that position. On retrieval, you find the slot, iterate over that list, and look for an item with the requested key. The slot for an incoming key is calculated as hash(key) % array length.
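The array-of-lists idea can be sketched in a few lines of Python. This is only an illustration, not a production design: the class name `ChainedHashTable` and its methods are made up for this example, and plain Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Hash table that resolves collisions with one list (chain) per slot."""

    def __init__(self, num_slots=7):
        # The backing store: an array with one initially empty list per slot.
        self.slots = [[] for _ in range(num_slots)]

    def _slot_for(self, key):
        # hash(key) % array length picks the slot for this key.
        return hash(key) % len(self.slots)

    def put(self, key, value):
        chain = self.slots[self._slot_for(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # key already stored: overwrite it
                return
        chain.append((key, value))  # new key: add it to the end of the chain

    def get(self, key):
        # Walk the chain in the key's slot until the key is found.
        for existing_key, value in self.slots[self._slot_for(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedHashTable(num_slots=7)
table.put(3, "three")
table.put(10, "ten")  # 10 % 7 == 3, so this collides with key 3
```

Both keys end up in the chain at slot 3, and `get` has to step through that chain to tell them apart. The longer the chain, the more steps each lookup takes.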

Now consider an array with 7 elements as the backing store (hash table sizes are often prime numbers), and you store 10k items in it. Each slot in the backing array is going to hold ~1400 items (10,000 / 7 ≈ 1,429) because of the limited storage and the resulting collisions. When you ask for the value with key x, you have to look through those ~1400 items to find the correct one to return.
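To see those numbers, here is a rough sketch that only counts how many keys land in each slot. It assumes the keys are the integers 0 to 9,999 (in CPython a small non-negative integer hashes to itself, so the spread is perfectly even); real keys would spread less evenly. The helper name `chain_lengths` is made up for this example.

```python
def chain_lengths(num_slots, keys):
    """Count how many keys land in each slot of a table with num_slots slots."""
    counts = [0] * num_slots
    for key in keys:
        counts[hash(key) % num_slots] += 1
    return counts


lengths = chain_lengths(7, range(10_000))
longest = max(lengths)   # longest chain a lookup might have to walk
shortest = min(lengths)
print(longest, shortest)  # prints: 1429 1428
```

Every lookup in this table may have to compare against well over a thousand items, no matter which key is requested.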

By bumping the array size to 73, each slot now contains only ~137 items (10,000 / 73), assuming the hash spreads the keys evenly. That is a big reduction in the amount of work, particularly on retrieval.
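Resizing itself means allocating a bigger array and re-inserting every item, because an item's slot depends on the array length. A minimal sketch, again assuming integer keys 0 to 9,999 and using the hypothetical helper names `build_table` and `resize`:

```python
def build_table(num_slots, items):
    """Build an array of chains from (key, value) pairs."""
    slots = [[] for _ in range(num_slots)]
    for key, value in items:
        slots[hash(key) % num_slots].append((key, value))
    return slots


def resize(old_slots, new_num_slots):
    """Rehash every entry into a new array with new_num_slots slots."""
    new_slots = [[] for _ in range(new_num_slots)]
    for chain in old_slots:
        for key, value in chain:
            # The slot must be recomputed because the array length changed.
            new_slots[hash(key) % new_num_slots].append((key, value))
    return new_slots


small = build_table(7, ((k, str(k)) for k in range(10_000)))
large = resize(small, 73)

longest_before = max(len(chain) for chain in small)
longest_after = max(len(chain) for chain in large)
print(longest_before, longest_after)  # prints: 1429 137
```

The table holds the same 10,000 items before and after; only the chain lengths change. Real implementations typically resize automatically once the ratio of items to slots (the load factor) passes some threshold, which keeps chains short and search, insert, and remove close to constant time on average.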


 