
Hash of Hash in C

I'm fairly new to C, having come back to it after 6 years. I want to implement code that stores a tree for data like this:

string1 = "foo.bar.foo.*"
string2 = "foo.baz.*"
string3 = "foo.*.bar"

     foo
      |
  |   |   |
 bar baz  *
  |   |   |
 foo  *  bar
  |
  *

I'm trying to do this with a hash table:

struct entry_s {
    char *key;
    struct entry_s *value;
    struct entry_s *next;
};

But I don't think it works. What is the best way to do this, and is a hash map even the best data structure to use for this in C?

It seems that you want to implement a map with two levels: the first level maps strings to second-level maps, and each second-level map maps another key to a value. For example, in JavaScript syntax:

data = {
    "London": {
        "Paris": 450
    },
    "Paris": {
        "Madrid": 600,
        "Algiers": 700
    }
}

There are several ways to achieve this.

JavaScript variables carry their types with them, so you can use the same Map implementation for both levels. In C, you could implement two hash tables with different value types, e.g.:

struct OItem {                  // Outer map item
    const char *key;                // string key
    struct IMap *value;             // inner map value
    struct OItem *next;         
};

struct OMap {                   // Outer map
    struct OItem *head[oSize];      // hash table
};

struct IItem {                  // Inner map item
    const char *key;                // string key
    int value;                      // integer value
    struct IItem *next;
};

struct IMap {                   // Inner map
    struct IItem *head[iSize];      // hash table
};

This will give you the two-level structure above. (These hash tables have fixed sizes, so you might end up wasting a lot of space when, for example, the second-level maps are sparse. It might be better to use a single list or a balanced tree here. If the second-level map just emulates an object that always holds the same or similar data, consider using a struct instead.)
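A two-level lookup over these structures, such as lookup("London", "Paris"), could be sketched as follows. This is only an illustration: the bucket counts for oSize and iSize, the helper hash_str and the choice to return a pointer (so that a missing entry can be told apart from a stored 0) are assumptions, not part of the answer above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { oSize = 64, iSize = 16 };    /* assumed bucket counts */

struct IItem { const char *key; int value; struct IItem *next; };
struct IMap  { struct IItem *head[iSize]; };
struct OItem { const char *key; struct IMap *value; struct OItem *next; };
struct OMap  { struct OItem *head[oSize]; };

/* Hypothetical string hash (djb2-style, xor variant); any decent hash works. */
static unsigned int hash_str(const char *s)
{
    unsigned long h = 5381u;

    while (*s) h = h * 33 ^ (unsigned char) *s++;
    return (unsigned int) h;
}

/* Return a pointer to the stored value, or NULL if either key is missing. */
const int *lookup(const struct OMap *map, const char *k1, const char *k2)
{
    assert(map && k1 && k2);

    /* First level: find the inner map that belongs to k1. */
    const struct OItem *o = map->head[hash_str(k1) % oSize];
    while (o && strcmp(o->key, k1) != 0) o = o->next;
    if (o == NULL) return NULL;

    /* Second level: find k2 inside that inner map. */
    const struct IItem *i = o->value->head[hash_str(k2) % iSize];
    while (i && strcmp(i->key, k2) != 0) i = i->next;
    return i ? &i->value : NULL;
}
```

The sketch assumes every outer item carries a non-NULL inner map.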

You can use this structure with a function such as lookup("London", "Paris"), for example. If you don't need access to the inner map, you could also pack both levels into one big hash table by using two keys:

struct Item {
    const char *key1;
    const char *key2;
    int value;
    struct Item *next;
};

struct Map {
    struct Item *head[hSize];
};

When you calculate a hash, use both keys, for example:

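static unsigned int hash(const char *s1, const char *s2)
{
    unsigned long h = 5381u;

    // Cast to unsigned char so that bytes above 127 hash the same everywhere
    while (*s1) h = h * 33 ^ (unsigned char) *s1++;
    h = h * 33;                 // same as mixing in a '\0' between the keys
    while (*s2) h = h * 33 ^ (unsigned char) *s2++;

    return (unsigned int) h;
}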

When you look up an item, ensure that both keys match:

int map_find(const struct Map *map,
    const char *k1, const char *k2)
{
    unsigned int h = hash(k1, k2) % hSize;
    struct Item *item = map->head[h];

    while (item) {
        if (strcmp(item->key1, k1) == 0
         && strcmp(item->key2, k2) == 0) {
            return item->value;
        }

        item = item->next;
    }

    return 0;
}

This approach is perhaps more restrictive, but it has the advantage that you have just one data structure rather than many potentially oversized hash tables.
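To round off the single-table approach, here is a minimal sketch of how items might be inserted and retrieved. The names map_put, map_get and map_clear, the bucket count hSize = 256 and the out-parameter (used so that a stored 0 is not confused with "not found") are assumptions made for illustration. Keys are stored by pointer, not copied.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { hSize = 256 };               /* assumed bucket count */

struct Item {
    const char *key1;
    const char *key2;
    int value;
    struct Item *next;
};

struct Map {
    struct Item *head[hSize];
};

static unsigned int hash(const char *s1, const char *s2)
{
    unsigned long h = 5381u;

    while (*s1) h = h * 33 ^ (unsigned char) *s1++;
    h = h * 33;
    while (*s2) h = h * 33 ^ (unsigned char) *s2++;
    return (unsigned int) h;
}

/* Find the item for (k1, k2) in its bucket, or NULL. */
static struct Item *find_item(const struct Map *map, unsigned int h,
    const char *k1, const char *k2)
{
    struct Item *item = map->head[h];

    while (item && (strcmp(item->key1, k1) != 0
                 || strcmp(item->key2, k2) != 0)) {
        item = item->next;
    }
    return item;
}

/* Insert or update; returns 0 on success, -1 if allocation fails. */
int map_put(struct Map *map, const char *k1, const char *k2, int value)
{
    assert(map && k1 && k2);

    unsigned int h = hash(k1, k2) % hSize;
    struct Item *item = find_item(map, h, k1, k2);

    if (item == NULL) {
        item = malloc(sizeof *item);
        if (item == NULL) return -1;
        item->key1 = k1;            /* keys must outlive the map */
        item->key2 = k2;
        item->next = map->head[h];  /* push onto the bucket's list */
        map->head[h] = item;
    }
    item->value = value;
    return 0;
}

/* Returns 1 and stores the value in *out if found, 0 otherwise. */
int map_get(const struct Map *map, const char *k1, const char *k2, int *out)
{
    assert(map && k1 && k2 && out);

    const struct Item *item = find_item(map, hash(k1, k2) % hSize, k1, k2);

    if (item == NULL) return 0;
    *out = item->value;
    return 1;
}

/* Free all items and leave the map empty. */
void map_clear(struct Map *map)
{
    for (size_t i = 0; i < hSize; i++) {
        while (map->head[i]) {
            struct Item *next = map->head[i]->next;
            free(map->head[i]);
            map->head[i] = next;
        }
    }
}
```

A map declared as `struct Map map = { { NULL } };` starts out empty and can be passed to these functions directly.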

Finally, whatever you do, don't use the hash table implementation you found on GitHub. The author admits that it was more of a coding exercise. It doesn't deal with releasing the memory after use and has a poor hash function.


After you edited in your actual use case, it is clear that you want a trie. You can implement a trie as you suggested. The keys and values can be anything in your implementation, so they can also be strings and trie nodes. You can adapt your existing implementation to use pointers to a trie node structure as values. (All the comparison stuff stays the same, fortunately.)

One problem I see is that with a fixed hash-table size, you will end up wasting a lot of space. If your trie is sparse, it might be better to just use a linked list or a balanced binary tree as the map. In any case, you will have to find a suitable library or roll your own.
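As an illustration of the linked-list variant, here is a minimal sketch of a trie whose nodes keep their children in a singly linked list. The names TrieNode, trie_insert, trie_contains and trie_free_children are hypothetical, paths are assumed to be dot-separated as in the question, and "*" is treated as an ordinary label (no wildcard matching is attempted).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct TrieNode {
    char *label;                /* one path component, e.g. "foo" or "*" */
    int terminal;               /* non-zero if a stored path ends here */
    struct TrieNode *child;     /* first child */
    struct TrieNode *sibling;   /* next node on the same level */
};

/* Find the child whose label equals the first len bytes of s, or NULL. */
static struct TrieNode *child_find(const struct TrieNode *parent,
    const char *s, size_t len)
{
    struct TrieNode *n = parent->child;

    while (n && (strlen(n->label) != len || memcmp(n->label, s, len) != 0)) {
        n = n->sibling;
    }
    return n;
}

/* Insert a dot-separated path; returns 0 on success, -1 if out of memory. */
int trie_insert(struct TrieNode *root, const char *path)
{
    assert(root && path);

    struct TrieNode *node = root;

    while (*path) {
        size_t len = strcspn(path, ".");        /* length of this component */
        struct TrieNode *n = child_find(node, path, len);

        if (n == NULL) {                        /* create the missing child */
            n = calloc(1, sizeof *n);
            if (n == NULL) return -1;
            n->label = malloc(len + 1);
            if (n->label == NULL) { free(n); return -1; }
            memcpy(n->label, path, len);
            n->label[len] = '\0';
            n->sibling = node->child;           /* push onto the child list */
            node->child = n;
        }
        node = n;
        path += len;
        if (*path == '.') path++;               /* skip the separator */
    }
    node->terminal = 1;
    return 0;
}

/* Exact lookup: returns 1 if path was inserted, 0 otherwise. */
int trie_contains(const struct TrieNode *root, const char *path)
{
    assert(root && path);

    const struct TrieNode *node = root;

    while (*path) {
        size_t len = strcspn(path, ".");

        node = child_find(node, path, len);
        if (node == NULL) return 0;
        path += len;
        if (*path == '.') path++;
    }
    return node->terminal != 0;
}

/* Recursively free everything below node (node itself is not freed). */
void trie_free_children(struct TrieNode *node)
{
    struct TrieNode *n = node->child;

    while (n) {
        struct TrieNode *next = n->sibling;
        trie_free_children(n);
        free(n->label);
        free(n);
        n = next;
    }
    node->child = NULL;
}
```

Inserting the three strings from the question into an empty root yields a single "foo" node with the children "bar", "baz" and "*", which matches the tree drawn in the question.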

Your question doesn't really make sense, and I think that's because you don't really understand how hash tables work, so here's some (crude and untested) code to show you how they work:

#include <stdlib.h>
#include <string.h>

typedef struct entry_s {
    char *key;
    char *value;
    struct entry_s *next;
} entry_t;

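#define MAX_HASH 1234

entry_t *myHashTable[MAX_HASH];

unsigned int calculateHash(const char *key);   // any string hash function; not shown here

void insert(char *key, char *value) {
    entry_t *entry;
    unsigned int hash;

    hash = calculateHash(key) % MAX_HASH;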

    // Create entry

    entry = malloc(sizeof(entry_t));
    if(entry == NULL) exit(EXIT_FAILURE);
    entry->key = key;
    entry->value = value;

    // Add entry to the singly linked list for this hash

    entry->next = myHashTable[hash];
    myHashTable[hash] = entry;
}


entry_t *find(char *key) {
    entry_t *entry;
    unsigned int hash;

    hash = calculateHash(key) % MAX_HASH;
    entry = myHashTable[hash];
    while(entry != NULL) {
        if(strcmp(key, entry->key) == 0) {
            return entry;
        }
        entry = entry->next;
    }
    return NULL;
}


void delete(char *key) {
    entry_t *previous = NULL;
    entry_t *entry;
    unsigned int hash;

    hash = calculateHash(key) % MAX_HASH;
    entry = myHashTable[hash];
    while(entry != NULL) {
        if(strcmp(key, entry->key) == 0) {

            // Remove entry from the singly linked list for this hash

            if(previous == NULL) {
                myHashTable[hash] = entry->next;
            } else {
                previous->next = entry->next;
            }

            // Free the memory and return

            free(entry);
            return;
        }
        previous = entry;
        entry = entry->next;
    }
}

Note: A good understanding of singly linked lists will help you figure out what this example is doing.
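The example above calls calculateHash without defining it. One possible stand-in, assuming a djb2-style string hash (xor variant) is acceptable, might look like the sketch below. The signature is an assumption; any function that maps a string to an unsigned integer would work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical calculateHash: djb2-style string hash, xor variant. */
unsigned int calculateHash(const char *key)
{
    assert(key != NULL);

    unsigned long h = 5381u;

    /* Mix each byte in; the cast avoids sign-extending bytes above 127. */
    while (*key) h = h * 33 ^ (unsigned char) *key++;

    return (unsigned int) h;
}
```

The caller reduces the result to a bucket index with `% MAX_HASH`, as the functions above already do.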
