
What's wrong with this C function?

The following function has to split a line into two or more lines, each of which is shorter than s.

char **splitline(FILE *fp, int s)
{
    char **l;
    char c;
    int ccounter;
    int lcounter;

    c = fgetc(fp);
    if (c == EOF)
    {
        return NULL;
    }

    lcounter=0;
    l = malloc(sizeof(char **));
    l[lcounter] = malloc((SIZE+2)*sizeof(char));

    ccounter = 0;
    while (c != EOF && c != '\n')
    {
        l[lcounter][ccounter] = c;
        ccounter++;
        c = fgetc(fp);

        if (ccounter == SIZE)
        {
            l[lcounter][ccounter] = '\n';
            ccounter++;
            l[lcounter][ccounter] = '\0';

            realloc(l, (lcounter+2) * sizeof(char **));

            lcounter++;

            l[lcounter] = malloc((SIZE+2) * sizeof(char));
            ccounter = 0;
        }
    }

    if (ccounter == 0)
    {
        l[lcounter][ccounter] = '\0'; 
    }
    else
    {
        l[lcounter][ccounter] = '\n';
        ccounter++;
        l[lcounter][ccounter] = '\0';

        realloc(l, (lcounter+2) * sizeof(char **));

        lcounter++;

        ccounter = 0;
        l[lcounter] = malloc((SIZE+2) * sizeof(char));
        l[lcounter][ccounter] = '\0';
    }

    return l;
}
  • You are using an undefined constant SIZE instead of the function argument s to control the maximum length of your lines. Since s is a signed integer, you should check it for sanity: it should be a positive (non-zero) number, maybe not larger than 1 MiB; maybe you want to set a lower limit bigger than 1; maybe you default to 80 if the caller screws up, or maybe you return with an error.

  • You are using char c; to save the value read with fgetc(); sadly, that means your EOF test is unreliable. Either you'll stop prematurely when someone supplies 'ÿ' (y-umlaut, hex 0xFF in ISO 8859-1 and 8859-15, or U+00FF in ISO 10646 - Unicode) or you won't stop at all, depending on whether the type char is signed or unsigned. Always remember: getchar() and relatives return an int!

  • A variable named l invites confusion with the constant 1; generally avoid it.

  • Your main loop condition would be better if you tested the result of fgetc() directly.

     int c;
     while ((c = fgetc(fp)) != EOF && c != '\n') { ... }

    As it is, you read a character part way through the loop and don't check it properly. You might then use ungetc() to push the first character read back into the input stream; it makes the input processing more uniform. Alternatively, you might set things up so that everything works correctly if the first call to fgetc() is in the loop control and it returns EOF.

  • As Tommy pointed out, you must capture the return value of realloc(); there is no guarantee that it will return its input pointer as the result. You should also learn now that you do not save the result of realloc() into the variable specified as its first argument. If you do and the reallocation fails, you immediately leak memory, because you've lost the pointer to the old memory - it just got overwritten with a null pointer. So, to be safe, you use:

     char **new_array = realloc(l, (lcounter+1) * sizeof(*new_array));
     if (new_array == 0)
         ...handle out of memory...
     else
         l = new_array;

    A couple of points here. You allocated (lcounter+2) values, but I think you only ever use one of them (unless there's a terminal null pointer to mark the end of the array). You specified sizeof(char **) but actually you want an array of char * values. Fortunately, on practically every platform you are likely to meet, all object pointers are the same size. Strictly, the C standard does not require that of every pair of object pointer types, and only POSIX (not the C standard) guarantees that function pointers are the same size as object pointers. So, sizeof(char **) == sizeof(char *) in practice and you are safe, but you are not asking for what you wanted.

  • A corollary to the discussion about realloc() possibly failing is that malloc() may fail too. You should error check your memory allocations - or use a set of cover functions for the standard library that only return if the pointer they would return is not null. If you don't check, your program will crash eventually because of a memory allocation failure - even on a machine with 24 GiB of main memory (though it might take a while to get to that point).

  • There is a lot of repetition in the code. You should look to avoid that. It means using a sub-function, perhaps, to manage the memory allocations.
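
To make the point about char versus int concrete, here is a minimal sketch of a read loop that keeps the result of fgetc() in an int. The helper name count_bytes is made up for illustration and is not part of the original code.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining in fp.
   fgetc() returns each byte as an unsigned char converted to int, so
   the byte 0xFF arrives as 255 and stays distinct from EOF, which is
   a negative int. Storing the result in a char loses that distinction. */
static long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                      /* int, not char */

    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}
```

With char c instead, the same loop would either stop at the first 0xFF byte (signed char) or never see EOF at all (unsigned char).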

If you fix those, you're in with a fighting chance of getting a working function. You should also write the code to release the allocated memory that you return, so that users don't have to devise a method to do so on your behalf. Always worry about who is going to release allocated memory and how it will be released.
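
Putting the points above together, one possible shape for the function is sketched below. It is not the original poster's code, and it makes several assumptions that are not in the question: the names split_line, add_line and free_lines are invented; the result is terminated by a null pointer rather than by an empty string; each piece holds at most s characters followed by '\n'; and NULL is returned both at end of input and when an allocation fails.

```c
#include <stdio.h>
#include <stdlib.h>

/* Release an array returned by split_line(); safe to call with NULL. */
void free_lines(char **lines)
{
    if (lines == NULL)
        return;
    for (size_t i = 0; lines[i] != NULL; i++)
        free(lines[i]);
    free(lines);
}

/* Grow the array to hold one more string plus the null terminator and
   allocate that string (room for s characters, '\n' and '\0').
   Returns 0 on success, -1 on failure. Either way *lines is still
   null-terminated (or NULL), so free_lines() can clean it up. */
static int add_line(char ***lines, size_t count, int s)
{
    char **grown = realloc(*lines, (count + 2) * sizeof(*grown));
    if (grown == NULL)
        return -1;              /* old array untouched */
    *lines = grown;
    grown[count] = malloc((size_t)s + 2);
    grown[count + 1] = NULL;
    return grown[count] == NULL ? -1 : 0;
}

/* Read one input line from fp and split it into pieces of at most s
   characters, each ending in "\n". */
char **split_line(FILE *fp, int s)
{
    char **lines = NULL;
    size_t count = 0;           /* pieces allocated so far */
    int used = 0;               /* characters in the current piece */
    int c;                      /* int, so EOF is detected reliably */

    if (fp == NULL || s < 1)
        return NULL;

    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (count == 0 || used == s) {      /* need a fresh piece */
            if (count > 0) {                /* finish the full one */
                lines[count - 1][used] = '\n';
                lines[count - 1][used + 1] = '\0';
            }
            if (add_line(&lines, count, s) != 0) {
                free_lines(lines);
                return NULL;
            }
            count++;
            used = 0;
        }
        lines[count - 1][used++] = (char)c;
    }

    if (count == 0) {
        if (c == EOF)
            return NULL;                    /* nothing left to read */
        if (add_line(&lines, count, s) != 0) {   /* empty input line */
            free_lines(lines);
            return NULL;
        }
        count++;
    }
    lines[count - 1][used] = '\n';          /* finish the last piece */
    lines[count - 1][used + 1] = '\0';
    return lines;
}
```

A caller would loop on split_line(fp, s) until it returns NULL and pass every non-null result to free_lines(). Distinguishing end of input from an allocation failure would need an extra status value, which this sketch leaves out.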


Chris Lutz asked:

And where does the standard guarantee that all object pointer types are the same size?

I believe the relevant section of the (C99 - ISO/IEC 9899:1999) standard is:

§6.3.2.3 Pointers

1 A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

That basically says all object pointer types can be converted to void pointers and back again without loss of information. Strictly speaking, that makes void * able to hold any object pointer value rather than forcing every object pointer type to be the same size (§6.2.5 says pointers to different types need not share a representation), but on mainstream platforms they are all the same size. Note that the category 'object pointers' does not include 'function pointers'.

The relevant section of the POSIX standard (about function pointers) is §2.12.3 (at the bottom of the linked section).

The main one I can spot: realloc may not keep your buffer at the same address. You should use

 l = realloc(...)

Actually, see the comment of Matteo below. realloc may do one of three things:

  • extend the size of the buffer you already have (in which case you'll get the same pointer back);
  • allocate a new buffer elsewhere, copy your buffer's contents across and free the original (in which case you'll get a different pointer back);
  • find itself unable to get you a buffer of the required size anywhere and fail, leaving your current contents intact (returning NULL).

In the final case, assignment without checking the return value would cause a memory leak.
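
The three outcomes can be handled in one place with a small wrapper. This is only a sketch: the name resize_chars is invented here, and it assumes new_size is greater than zero, since the behaviour of realloc() with a size of zero varies between implementations.

```c
#include <stdlib.h>

/* Hypothetical wrapper: resize *buf to new_size bytes (new_size > 0).
   Returns 1 on success, 0 on failure. On failure *buf still points at
   the old, intact block, so nothing is leaked. */
static int resize_chars(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);

    if (tmp == NULL)
        return 0;       /* third outcome: old block left untouched */
    *buf = tmp;         /* first or second outcome: same or new address */
    return 1;
}
```

The caller keeps using its own pointer either way and decides what to do when 0 comes back, typically freeing the old block and reporting the error.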
