
The function only works with a useless printf

I usually try hard to solve any bugs I find in my code myself, but this one defies all logic for me. The function works fine with any string and separator character, but only with that useless printf inside the function's while loop. Without it, the program prints

-> Lorem

then

-> ▼

and crashes afterwards. Thanks in advance to anyone who can tell me what is happening.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <stdint.h>

char **strsep_(char *str, char ch) {
    // Sub-string length
    uint8_t len = 0;
    // The number of sub-strings found means the same as the position where it will be stored in the main pointer
    // Obviously, the number tends to increase over time, and at the end of the algorithm, it means the main pointer length too
    uint8_t pos = 0;
    // Storage for any found sub-strings and one more byte as the pointer is null-terminated
    char **arr = (char**)malloc(sizeof(char **) + 1);
    while (*str) {
        printf("Erase me and it will not work! :)\n");
        if (*str == ch) {
            // The allocated memory should be one step ahead of the current usage
            arr = realloc(arr, sizeof(char **) * pos + 1);
            // Allocates enough memory in the current main pointer position and the '\0' byte
            arr[pos] = malloc(sizeof(char *) * len + 1);
            // Copies the sub-string size (based in the length number) into the previously allocated space
            memcpy(arr[pos], (str - len), len);
            // `-_("")_-k
            arr[pos][len] = '\0';
            len = 0;
            pos++;
        } else {
            len++;
        }
        *str++;
    }
    // Is not needed to reallocate additional memory if no separator character was found
    if (pos > 0) arr = realloc(arr, sizeof(char **) * pos + 1);
    // The last chunk of characters after the last separator character is properly allocated
    arr[pos] = malloc(sizeof(char *) * len + 1);
    memcpy(arr[pos], (str - len), len);
    // To prevent undefined behavior while iterating over the pointer
    arr[++pos] = NULL;

    return arr;
}

void strsep_free_(char **arr) {
    char **aux = arr;
    while (*arr) {
        free(*arr);
        *arr = NULL;
        arr++;
    }
    // One more time to fully deallocate the null-terminated pointer
    free(*arr);
    *arr = NULL;
    arr++;
    // Clearing The pointer itself 
    free(aux);
    aux = NULL;
}

int main(void) {
    char **s = strsep_("Lorem ipsum four words", ' ');
    char **i = s;
    while (*i != NULL) {
        printf("-> %s\n", *i);
        i++;
    }
    strsep_free_(s);
}

The most likely reason for the crash is this: realloc(arr, sizeof(char **) * pos + 1).

That is the same as realloc(arr, (sizeof(char **) * pos) + 1), which does not allocate enough space for your "array". You need to do realloc(arr, sizeof(char **) * (pos + 1)).

The same applies to the allocation for arr[pos]: you need to place the parentheses correctly there too.
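To make the precedence issue concrete, here is a minimal sketch that computes the byte counts produced by the two expressions. The helper names bytes_as_written and bytes_needed are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes requested by the expression from the question.
   Multiplication binds tighter than addition, so this is
   (sizeof(char **) * pos) + 1: pos pointers plus one single byte. */
size_t bytes_as_written(size_t pos) {
    return sizeof(char **) * pos + 1;
}

/* Bytes needed to hold pos + 1 pointers.
   The array elements have type char *, so that is the element size used. */
size_t bytes_needed(size_t pos) {
    assert(pos < SIZE_MAX / sizeof(char *));  /* guard against overflow */
    return sizeof(char *) * (pos + 1);
}
```

Assuming 8-byte pointers, pos = 3 gives 25 bytes with the original expression but 32 bytes with the parenthesized one, so storing a fourth pointer writes past the end of the allocation.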

Your program has undefined behavior, which means it may behave in unexpected ways, but could by chance behave as expected. Adding the extra printf changes the behavior in a way that seems to correct the bug, but only by coincidence. On a different machine, or even on the same machine at a different time, the behavior may change again.

There are multiple bugs in your program that lead to undefined behavior:

  • You are not allocating the array with the proper size: it should have space for pos + 1 pointers, hence sizeof(char **) * (pos + 1). The faulty statements are: char **arr = (char**)malloc(sizeof(char **) + 1); and arr = realloc(arr, sizeof(char **) * pos + 1);. The final reallocation needs room for pos + 2 pointers, because it is followed by stores to both arr[pos] and arr[pos + 1].

  • Furthermore, the space allocated for each substring is incorrect too: arr[pos] = malloc(sizeof(char *) * len + 1); should read arr[pos] = malloc(sizeof(char) * len + 1);, which by definition is arr[pos] = malloc(len + 1);. This does not lead to undefined behavior, you just allocate too much memory. The last substring, however, is never null-terminated: the final memcpy is not followed by arr[pos][len] = '\0';, so printing it reads uninitialized memory. If your system supports it, allocation and copy can be combined in one call to strndup(str - len, len).

  • You never check for memory allocation failure, which causes undefined behavior if an allocation fails.

  • Using uint8_t for len and pos is risky: what if the number of substrings exceeds 255? pos and len would silently wrap back to 0, producing unexpected results and memory leaks. There is no advantage in using such a small type; use int or size_t instead.
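The wraparound mentioned in the last point can be demonstrated with a small sketch. The function names count_in_uint8 and count_in_size_t are hypothetical helpers, not part of the code in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts n events in a uint8_t, the type the question uses for len and pos.
   Converting back to uint8_t reduces the value modulo 256. */
uint8_t count_in_uint8(size_t n) {
    uint8_t count = 0;
    for (size_t i = 0; i < n; i++)
        count++;
    assert(count == (uint8_t)(n % 256));  /* the counter only keeps n mod 256 */
    return count;
}

/* Same loop with a size_t counter: no wraparound for realistic inputs. */
size_t count_in_size_t(size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count++;
    return count;
}
```

With 300 events, the uint8_t counter ends at 44 while the size_t counter ends at 300. In the question's function, this would make later substrings overwrite earlier array slots, leaking the pointers stored there.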

Here is a corrected version:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char **strsep_(const char *str, char ch) {
    // Sub-string length
    int len = 0;
    // The number of sub-strings found, index where to store the NULL at the end of the array.
    int pos = 0;
    // return value: array of pointers to substrings with an extra slot for a NULL terminator.
    char **arr = (char**)malloc(sizeof(*arr) * (pos + 1));
    if (arr == NULL)
        return NULL;
    for (;;) {
        if (*str == ch || *str == '\0') {
            // allocate the substring and reallocate the array
            char *p = malloc(len + 1);
            char **new_arr = realloc(arr, sizeof(*arr) * (pos + 2));
            if (new_arr == NULL || p == NULL) {
                // allocation failure: free the memory allocated so far
                free(p);
                if (new_arr)
                    arr = new_arr;
                while (pos-- > 0)
                    free(arr[pos]);
                free(arr);
                return NULL;
            }
            arr = new_arr;
            memcpy(p, str - len, len);
            p[len] = '\0';
            arr[pos] = p;
            pos++;
            len = 0;
            if (*str == '\0')
                break;
        } else {
            len++;
        }
        str++;
    }
    arr[pos] = NULL;
    return arr;
}

void strsep_free_(char **arr) {
    int i;
    // Free the array elements 
    for (i = 0; arr[i] != NULL; i++) {
        free(arr[i]);
        arr[i] = NULL;  // extra safety, not really needed
    }
    // Free The array itself 
    free(arr);
}

int main(void) {
    char **s = strsep_("Lorem ipsum four words", ' ');
    int i;
    for (i = 0; s[i] != NULL; i++) {
        printf("-> %s\n", s[i]);
    }
    strsep_free_(s);
    return 0;
}

Output:

-> Lorem
-> ipsum
-> four
-> words
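As a side note, the strndup suggestion from the answer can be sketched as below. This assumes a POSIX.1-2008 system where strndup is available; copy_token is a hypothetical helper name chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* asks POSIX headers to declare strndup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates the len characters that end just before
   `end`, mirroring the (str - len, len) arguments used in the answer.
   Returns a null-terminated copy that the caller must free(),
   or NULL if the allocation fails. */
char *copy_token(const char *end, size_t len) {
    assert(end != NULL);
    return strndup(end - len, len);
}
```

Because strndup allocates len + 1 bytes and writes the terminator itself, it replaces the separate malloc, memcpy, and '\0' assignment. The caller still has to check the result for NULL.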

Good answer from @chqrlie. For my part, I think it would be better to count everything before copying, which avoids the need for realloc.

#include <string.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int count_chars(const char *str, const char ch)
{
    int i;
    int count;

    i = 0;
    count = 0;
    if (*str == ch)
        str++;

    while (str[i] != ch && str[i] != '\0')
    {
        count++;
        i++;
    }

    return (count);
}

int count_delimeter(const char *str, const char ch)
{
    int i = 0;
    int count = 0;

    while (str[i])
    {
        if (str[i] == ch && str[i + 1] != ch)
            count++;
        i++;
    }

    return count;
}

char** strsep_(const char *str, const char ch)
{
    char **arr;
    int index = 0;
    int size = 0;
    int i = 0;

    size = count_delimeter(str, ch) + 1;

    if ((arr = malloc(sizeof(char *) * (size + 1))) == NULL)
        return (NULL);
    arr[size] = NULL;

    while (i < size)
    {
        // Skip every delimiter before the next token (handles runs of any length)
        while (str[index] == ch)
            index++;

        int len = count_chars(&str[index], ch);
        if ((arr[i] = malloc(sizeof(char) * (len + 1))) == NULL)
        {
            // Allocation failure: release everything allocated so far
            while (i-- > 0)
                free(arr[i]);
            free(arr);
            return NULL;
        }

        memcpy(arr[i], &str[index], len);
        index += len;
        arr[i++][len] = '\0';
    }

    return arr;
}

int main(void)
{
    const char *str = "Lorem   ipsum  ipsum Lorem lipsum gorem insum";
    char **s = strsep_(str, ' ');
    /* const char *str = "Lorem + Ipsum"; */
    /* char **s = strsep_(str, '+'); */
    /* const char *str = "lorem, torem, horem, lorem"; */
    /* char **s = strsep_(str, ','); */
    if (s == NULL)
        return 1;

    /* Iterate with a separate pointer so that s can still be freed */
    for (char **p = s; *p != NULL; p++) {
        printf("-> [%s]\n", *p);
        free(*p);
    }

    /* Free the array itself */
    free(s);
    return 0;
}
