Can someone help me find the segfault here?

EDIT: So, it turns out that 'index' was not being reset to 0. Well then. That fixed one segfault, but I'm still getting a different one. Working on it.

node* new_node(void){
    node* ptr = malloc(sizeof(node));
    for (int i = 0; i<27; i++) {
        ptr->next[i] = NULL;
    }
    return ptr;
}
bool load(const char* dictionary)
{
    FILE* dict = fopen(dictionary, "r");
    node* ptr = new_node;
    char word[LENGTH+1];
    int index = 0;
    for (int c = fgetc(dict); c!=EOF; c = fgetc(dict)){
        if(c!='\n'){
            word[index]=c;
            index++;
        }
        else {
            for(int x=0; x<=index; x++){
                int ch = (word[x] == '\'') ? 26 : tolower(word[x])-'a';
                if (ptr->next[ch] == NULL){
                    ptr->next[ch] = new_node;
                }
                ptr = ptr->next[ch];
            }
            ptr->end=true;
        }
    }
    return true;
}

I'm trying to implement a trie data structure for a dictionary, but my program seems to segfault somewhere in this function. I can't seem to pin it down even with the help of GDB, so can someone give me a hand?

Node is defined as such:

typedef struct node{
    bool end;
    struct node* next[27];
} node;

Dictionary file:

a
aaa
aaas
aachen
aalborg
aalesund
aardvark
aardvark's
aardvarks
aardwolf

(...)

You have many issues in your code:

  • When you allocate memory with malloc, it is uninitialised. Initialise it directly after allocating it, so that NULL pointers really are null. (calloc, a cousin of malloc, initialises all memory to zero.)

  • When you loop over the word, you should not include index:

     for (int x = 0; x < index; x++) ...

  • When you have found the end of a word, you must reset index to 0. Otherwise, you will append to the old word and overflow the buffer. (You should probably also enforce the upper bound of index.)

  • Likewise, when you insert a word into the trie, you must reset your pointer for trie traversal to the trie's root. You need two pointers here: a root node pointer and an auxiliary pointer for traversing the trie.

  • As is, your trie is local to your function. Return the root node, so that other functions can use the trie, or NULL on failure.

Fix these, and you will have a non-crashing function. (It still leaks memory and may not construct the trie properly.)

    node *load(const char *dictionary)
    {
        FILE *dict = fopen(dictionary, "r");
        if (dict == NULL) {
            return NULL;                    /* could not open the file */
        }

        node *head = calloc(1, sizeof(node));
        if (head == NULL) {
            fclose(dict);
            return NULL;
        }

        char word[LENGTH + 1];
        int index = 0;

        for (int c = fgetc(dict); c != EOF; c = fgetc(dict)) {
            if (c != '\n') {
                if (index < LENGTH) {       /* enforce the upper bound of index */
                    word[index] = c;
                    index++;
                }
            } else {
                node *ptr = head;           /* restart traversal at the root */

                for (int x = 0; x < index; x++) {
                    int ch = (word[x] == '\'') ? 26 : tolower(word[x]) - 'a';
                    if (ptr->next[ch] == NULL) {
                        ptr->next[ch] = calloc(1, sizeof(node));
                    }
                    ptr = ptr->next[ch];
                }
                ptr->end = true;
                index = 0;                  /* start the next word */
            }
        }

        fclose(dict);
        return head;
    }
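
The remaining leak mentioned above could be addressed with a recursive release routine. What follows is only a sketch: trie_free is a hypothetical name that does not appear in the original post, and it returns the number of nodes it released purely so that its behaviour is easy to check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    bool end;
    struct node *next[27];
} node;

/* Hypothetical helper (not in the original post): frees the node n and
 * every node below it. Returns the number of nodes that were freed. */
size_t trie_free(node *n)
{
    if (n == NULL) {
        return 0;
    }

    size_t freed = 1;                 /* counts n itself */
    for (int i = 0; i < 27; i++) {
        freed += trie_free(n->next[i]);
    }
    free(n);
    return freed;
}
```

The recursion depth equals the length of the longest stored word plus one, so with a bounded word length the stack use stays small.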

You forgot to reset index to 0 at the beginning of the loop.

You should also use calloc(1, sizeof(node)) instead of malloc(sizeof(node)) to avoid leaving memory uninitialized. I suggest you use valgrind to help you track down problems of this kind in your code.

The lines:

node* ptr = new_node;

and

ptr->next[ch] = new_node;

are not calling the function, but assigning the address of the function to ptr. Call the function instead.

This problem could have been caught if the compiler warnings -Wall and -Wextra had been enabled.
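
As a minimal sketch of the difference, the snippet below keeps the wrong assignment as a comment next to the corrected call. The node type and new_node come from the question (here allocated with calloc); make_root is a hypothetical wrapper added only for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    bool end;
    struct node *next[27];
} node;

/* Allocates a zeroed node; returns NULL if the allocation fails. */
node *new_node(void)
{
    return calloc(1, sizeof(node));
}

/* Hypothetical wrapper (not in the original post) showing the fix. */
node *make_root(void)
{
    /* node *ptr = new_node;      wrong: stores the function's address */
    node *ptr = new_node();    /* right: the parentheses call the function */
    return ptr;
}
```

Without the parentheses, ptr->next[ch] reads from the function's machine code rather than from a node, which would explain a segfault on the first dereference.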


There is no bounds checking done on the array word. Use the value LENGTH to check if the index is in bounds before using it.

It isn't clear what the if statement inside the for loop is doing. It appears that every time a newline is found the whole array word is added to the tree, but index isn't reset, so the same array is added multiple times. At some point index will point out of bounds, causing undefined behavior. You should reset index after you use the array word.

You should filter punctuation and unsupported characters a bit more. Any character other than a letter (a-z, A-Z), a newline or an apostrophe can crash your program because of
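
One possible shape for that check is sketched below. append_char is a hypothetical helper, and the value 45 for LENGTH is an assumption, since the question does not show its definition.

```c
#include <stdbool.h>

#define LENGTH 45   /* assumed value; the post does not show the definition */

/* Hypothetical helper (not in the original post): stores c at word[*index]
 * and advances *index, but only while there is room for LENGTH characters.
 * Returns false, leaving word and *index untouched, when the buffer is full. */
bool append_char(char word[LENGTH + 1], int *index, int c)
{
    if (*index >= LENGTH) {
        return false;
    }
    word[*index] = (char)c;
    (*index)++;
    return true;
}
```

A caller could use the false return to skip the rest of an overlong line instead of writing past the end of word.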

int ch = (word[x] == '\'') ? 26 : tolower(word[x])-'a';
if (ptr->next[ch] == NULL){

Given that you open a file, there might be a space somewhere or some unexpected character. You need something like

    if(c!='\n'){
        int num = (c == '\'') ? 26 : tolower(c) - 'a';
        if(num >=0 && num < 27)
        {
           word[index]=c;
           index++;
        }
    }
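
The same filter could also be written as a small mapping function, sketched here under the assumption of an ASCII-compatible character set. char_index is a hypothetical name. Comparing against 'a' and 'z' directly, rather than testing the subtracted value against 0 and 27, also rejects '{', which follows 'z' in ASCII and would otherwise land on index 26.

```c
#include <ctype.h>

/* Hypothetical helper (not in the original post): maps a character to its
 * slot in node.next. Letters map to 0..25 regardless of case, the
 * apostrophe maps to 26, and anything else yields -1.
 * Assumes the letters a..z are contiguous, as they are in ASCII. */
int char_index(int c)
{
    if (c == '\'') {
        return 26;
    }

    /* tolower requires a value representable as unsigned char (or EOF). */
    int lower = tolower((unsigned char)c);
    if (lower >= 'a' && lower <= 'z') {
        return lower - 'a';
    }
    return -1;
}
```

A caller would then store the character only when char_index(c) is not -1.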
