Implementing a Hash Table using C

I am trying to read a file of names, and hash those names to a spot based on the value of their letters. I have succeeded in getting the value for each name in the file, but I am having trouble creating a hash table. Can I just use a regular array and write a function to put the words at their value's index?

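while (fgets(name, 100, ptr_file) != NULL) // read one line per pass, until end of file
            {
            // fgets has already read the name; a second fscanf here would skip every other line
            name[strcspn(name, "\n")] = '\0'; // strip the newline that fgets keeps
            printf("%s ", name); // prints the name
            int length = strlen(name); // gets the length of the name
            int i;
            value = 0; // start from zero for every name, or the sums carry over
            for (i = 0; i < length; i++)
            // for the length of the string add each letter's value up
                {
                value = value + (unsigned char)name[i];
                value = value % 50;
                }
            printf("value= %d\n", value);
            }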

No, not with a plain array alone, because you'll have collisions: different names will land on the same index, so you have to account for multiple values per hash. More generally, your hashing is not a wise implementation. Why do you limit the range of values to 50? Is memory really so scarce that you can't have a dictionary bigger than 50 pointers?

I recommend using an existing C string hash table implementation, like this one from 2001.

In general, you will have hash collisions and you will need to find a way to handle that. The two main ways of handling that situation are as follows:

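  1. If the hash table entry is taken, you use the next entry, and iterate until you find an empty entry (open addressing with linear probing). When looking things up, you need to probe the same sequence of entries. This is nice and simple for adding things to the hash table, but makes removing things tricky, because emptying an entry can cut off the entries that were stored past it.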

  2. Use hash buckets (separate chaining). Each hash table entry is a linked list of entries with the same hash value. First you find your hash bucket, then you find the entry on the list. When adding a new entry, you merely add it to the list.
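
As a rough illustration of the second approach, here is a minimal sketch of hash buckets in C. It reuses the question's sum-of-letters hash and its range of 50; the names `struct node`, `table`, `hash_name`, `table_find` and `table_insert` are made up for this example, and entries are never freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 50  /* same range as the question's "% 50"; an arbitrary choice */

/* One entry on a bucket's linked list (hypothetical layout). */
struct node {
    char *name;
    struct node *next;
};

/* Each slot is the head of a list; static storage starts out all NULL. */
static struct node *table[TABLE_SIZE];

/* The question's hash: sum of the letter values, reduced modulo the table size. */
unsigned hash_name(const char *name)
{
    unsigned value = 0;
    for (size_t i = 0; name[i] != '\0'; i++)
        value = (value + (unsigned char)name[i]) % TABLE_SIZE;
    return value;
}

/* Return the stored entry for name, or NULL if it is not in the table. */
struct node *table_find(const char *name)
{
    assert(name != NULL);
    for (struct node *n = table[hash_name(name)]; n != NULL; n = n->next)
        if (strcmp(n->name, name) == 0)
            return n;
    return NULL;
}

/* Add name unless it is already present.
   Returns 1 if added, 0 if it was a duplicate, -1 if allocation failed. */
int table_insert(const char *name)
{
    if (table_find(name) != NULL)
        return 0;

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->name = malloc(strlen(name) + 1);
    if (n->name == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->name, name);

    unsigned h = hash_name(name);
    n->next = table[h];  /* push onto the front of the bucket's list */
    table[h] = n;
    return 1;
}
```

With this hash, "ab" and "ba" both land in the same bucket, since the sum does not depend on letter order. `table_insert` simply chains them on one list and `table_find` tells them apart with `strcmp`.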

This is my demo program, which reads strings (words) from stdin and stores them in a hash table. Then, at EOF, it iterates over the hash table, computes the number of words and the number of distinct words, and prints the most frequent word:

http://olegh.cc.st/src/words.c.txt

The hash table uses the double hashing algorithm.
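
For readers who have not met double hashing, the following is a minimal sketch of the general idea, not the code from words.c (which is not reproduced here). A first hash picks the starting slot and a second hash picks the step between probes. The table size of 101, the multipliers 31 and 17, and all `dh_` names are assumptions chosen for illustration; the table stores the caller's pointers and does not support removal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DH_SIZE 101  /* prime, so any step in 1..DH_SIZE-1 visits every slot */

/* Slots hold pointers owned by the caller; NULL marks an empty slot. */
static const char *dh_slots[DH_SIZE];

/* First hash: picks the starting slot. */
unsigned dh_hash1(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % DH_SIZE;
}

/* Second hash: picks the probe step; the "1 +" keeps it from ever being zero. */
unsigned dh_hash2(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 17u + (unsigned char)*s++;
    return 1u + h % (DH_SIZE - 1);
}

/* Walk the probe sequence for key. Returns the index of the slot that holds
   key, or of the first empty slot on the sequence, or -1 if the table is
   full and key is absent. */
int dh_probe(const char *key)
{
    assert(key != NULL);
    unsigned idx = dh_hash1(key);
    unsigned step = dh_hash2(key);
    for (int i = 0; i < DH_SIZE; i++) {
        if (dh_slots[idx] == NULL || strcmp(dh_slots[idx], key) == 0)
            return (int)idx;
        idx = (idx + step) % DH_SIZE;
    }
    return -1;
}

/* Returns 1 if key was added, 0 if already present, -1 if the table is full. */
int dh_insert(const char *key)
{
    int idx = dh_probe(key);
    if (idx < 0)
        return -1;
    if (dh_slots[idx] != NULL)
        return 0;
    dh_slots[idx] = key;
    return 1;
}

/* Returns nonzero if key is stored in the table. */
int dh_contains(const char *key)
{
    int idx = dh_probe(key);
    return idx >= 0 && dh_slots[idx] != NULL;
}
```

Because the step depends on the key, two words that collide on their first slot usually follow different probe sequences afterwards, which avoids the clustering that "use the next entry" tends to produce.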
