
Confusion about hash tables

I am currently studying for some interviews, and I've heard that candidates are sometimes asked to build a data structure from scratch, including a hash table. However, I am having some trouble really understanding hash tables from a programming perspective.

I've been building these data structures from scratch in C++, and I know that with templates I can create linked lists, dynamic arrays, binary search trees, etc., that can store basically any type of object (as long as that is the only type stored in a given instance of the data structure). So I would assume I could create a template or "generic" hash table where each instance stores one particular type of object. But two things confuse me:

  1. I know that, through a hash function, the different keys are mapped to indices in the array that makes up the hash table. But let's say you use the hash table you created to store objects of type Book, and then you create another hash table to store objects of type People. Obviously, different types of objects will have different member attributes, and one of these attributes would have to be the key. Would this mean that every object you would ever want to store in your hash table has to have at least one attribute with the same name? Your hash function needs some key value to hash, so it has to know which attribute of the object to use as the key. So, for example, would every object you want to store in this hash table need an attribute called "key" that the hash function can map to an index of the array? Otherwise, how would it know which "key" to hash?

  2. This also leads to the problem of the hash function itself. I've read that, depending on the dataset you're given, some hash functions are better than others. So if the hash function depends on the dataset, how could you possibly create a hash table data structure that can store any type of object?

So am I just overthinking this? Should I just learn to create an easy hash table that hashes integers when practicing for my interviews? And are hash tables in real life created generically, or do people usually come up with a different hash table depending on the type of data they have?

If this question is better suited for the Computer Science theory stack exchange, please let me know. I am just finding these little details are keeping me from truly understanding this data structure.

You need to separate the hash table from the hash function; these are different pieces of functionality.
There are two common practices that keep your hash table generic while still letting it hash objects properly.

  1. The first is to assume that your template type (call it T ) implements a hash method, and to use it. You don't care how it is implemented, as long as it is there.
  2. The other option is to take, in addition to the template type, a hash function hash(T) as a template parameter, which must be provided when declaring a hash table.
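
As a rough illustration of the first option, here is a minimal sketch of a separate-chaining table whose element type supplies its own hash() method. The names HashSet, Book, isbn and example_books are made up for this example, and the table has a fixed bucket count with no resizing, so treat it as an outline rather than a finished container.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical element type: it decides for itself which attribute is the "key".
struct Book {
    std::string isbn;
    std::string title;

    // Hash only the ISBN (simple polynomial string hash, chosen for illustration).
    std::size_t hash() const {
        std::size_t h = 0;
        for (unsigned char c : isbn) h = h * 31 + c;
        return h;
    }

    // Two books count as the same entry when their ISBNs match.
    bool operator==(const Book& other) const { return isbn == other.isbn; }
};

// Separate-chaining hash set; it only requires T to provide hash() and operator==.
template <typename T>
class HashSet {
public:
    // bucket_count must be greater than zero (assumption of this sketch).
    explicit HashSet(std::size_t bucket_count = 16) : buckets_(bucket_count) {}

    // Returns false if an equal element is already stored.
    bool insert(const T& value) {
        auto& bucket = buckets_[index_of(value)];
        for (const T& item : bucket)
            if (item == value) return false;
        bucket.push_back(value);
        ++size_;
        return true;
    }

    bool contains(const T& value) const {
        const auto& bucket = buckets_[index_of(value)];
        for (const T& item : bucket)
            if (item == value) return true;
        return false;
    }

    std::size_t size() const { return size_; }

private:
    // The table never looks inside T; it only asks T for its hash.
    std::size_t index_of(const T& value) const {
        return value.hash() % buckets_.size();
    }

    std::vector<std::list<T>> buckets_;
    std::size_t size_ = 0;
};

// Usage example: the second insert is rejected because the ISBN is the same.
inline void example_books() {
    HashSet<Book> books;
    assert(books.insert(Book{"978-0", "First title"}));
    assert(!books.insert(Book{"978-0", "Other title"}));
    assert(books.contains(Book{"978-0", ""}));
}
```

Notice that the table never needs an attribute named "key": Book chooses what to hash, and a People type could choose something entirely different.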

This basically solves both problems: the user, who knows the data distribution better than the library writer, supplies the hash function, and the supplied hash function works on the supplied type, regardless of what the "key" is.

If you choose the second option, you can implement default hash functions for the known and primitive types, so that users don't need to reinvent the wheel every time they use the library with standard types.
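
A minimal sketch of the second option might look like the following. The hash function is a template parameter, and it defaults to the standard library's std::hash<Key>, which already covers primitive types and std::string. The names HashMap, PersonId, PersonIdHash and example_maps are hypothetical, and the table again uses a fixed bucket count for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Separate-chaining map; the hasher is supplied by the user or defaults to std::hash.
template <typename Key, typename Value, typename Hasher = std::hash<Key>>
class HashMap {
public:
    // bucket_count must be greater than zero (assumption of this sketch).
    explicit HashMap(std::size_t bucket_count = 16, Hasher hasher = Hasher())
        : buckets_(bucket_count), hasher_(hasher) {}

    // Inserts a new entry or overwrites the value of an existing key.
    void put(const Key& key, const Value& value) {
        auto& bucket = buckets_[hasher_(key) % buckets_.size()];
        for (auto& entry : bucket)
            if (entry.first == key) { entry.second = value; return; }
        bucket.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const Value* find(const Key& key) const {
        const auto& bucket = buckets_[hasher_(key) % buckets_.size()];
        for (const auto& entry : bucket)
            if (entry.first == key) return &entry.second;
        return nullptr;
    }

private:
    std::vector<std::list<std::pair<Key, Value>>> buckets_;
    Hasher hasher_;
};

// Hypothetical user-defined key type with no default hash available.
struct PersonId {
    std::string country;
    int number;
    bool operator==(const PersonId& o) const {
        return country == o.country && number == o.number;
    }
};

// User-supplied hash function object: combines the hashes of both fields.
struct PersonIdHash {
    std::size_t operator()(const PersonId& id) const {
        return std::hash<std::string>()(id.country) * 31 + std::hash<int>()(id.number);
    }
};

inline void example_maps() {
    HashMap<std::string, int> ages;                      // default hasher
    ages.put("alice", 30);
    assert(*ages.find("alice") == 30);

    HashMap<PersonId, std::string, PersonIdHash> names;  // custom hasher
    names.put(PersonId{"NL", 7}, "Bob");
    assert(*names.find(PersonId{"NL", 7}) == "Bob");
    assert(names.find(PersonId{"NL", 8}) == nullptr);
}
```

This is the same shape the standard std::unordered_map uses: a hasher template parameter with a sensible default, which the user overrides only for their own types.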

