Understanding String assignments in C

Okay, I've read through a massive number of answers here on SO and in many other places, but I just can't seem to grasp this simple function. Please forgive me for asking something so simple; I haven't written C/C++ code in over 8 years and I'm very much trying to re-learn, so please have patience...

I've tried many different ways to do this, from assigning the string through a function parameter to simply returning it, but nothing seems to work inside the while loop. I get no errors at compile time, but I do get segfaults at runtime. I would very much like to find out why the following function does not work... I just don't understand why the else branch returns fine as type char *content, but strcat(content, line); does not, even though the man page for strcat shows that its definition should be (char *DEST, const char *SRC). As I currently understand it, trying to cast the line variable to a const char inside the while loop would just convert the pointer to an integer. So I'm stumped here and would like to be educated by those who have some time!

char * getPage(char *filename) {
    FILE *pFile;
    char *content;
    pFile = fopen(filename, "r");
    if (pFile != NULL) {
        syslog(LOG_INFO,"Reading from:%s",filename);
        char line [256];
        while (fgets(line, sizeof line, pFile) != NULL) {
            syslog(LOG_INFO,">>>>>>>Fail Here<<<<<<<");
            strcat(content, line);
        }
        fclose(pFile);
    } else {
        content = "<!DOCTYPE html><html lang=\"en-US\"><head><title>Test</title></head><body><h1>Does Work</h1></body></html>";
        syslog(LOG_INFO,"Reading from:%s failed, serving static response",filename);
    }
    return content;
}

I very much appreciate all the great answers in this post. I would give everyone in the discussion a checkmark, but unfortunately I can't...

You need to allocate memory for content. The way you are doing it, the buffer has to be big enough for the entire file. You can either allocate a huge buffer up front and hope for the best, or allocate a smaller one and realloc it as needed.

Even better would be rearranging the code to avoid the need to store the whole file all at once, although if your caller needs a whole web page as a string, that may be hard.
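One way to read that suggestion: if the caller can accept the page in pieces, the file can be copied chunk by chunk to wherever it is going, and no whole-file buffer is needed. A minimal sketch, assuming the destination is another FILE * (the name stream_page and the out parameter are hypothetical and do not appear in the question):

```c
#include <stdio.h>

/* Copy everything from `in` to `out` in fixed-size chunks.
 * Returns 0 on success, -1 on a read or write error.
 * stream_page is a hypothetical helper, not part of the original code. */
int stream_page(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;              /* short write */
    }
    return ferror(in) ? -1 : 0;     /* tell EOF apart from a read error */
}
```

Memory use stays constant no matter how large the file is, at the cost of the caller never seeing the page as a single string.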

Note also that you need to return the same type of memory from both of your code paths. You can't return a static string sometimes and a heap-allocated string other times. That's guaranteed to cause headaches and/or memory leaks. So if you are copying the file contents into a block of memory, you should also copy the static string into the same type of block.

This is pretty simple, but very surprising if you're used to a higher-level language. C does not manage memory for you, and C doesn't really have strings. That content variable is a pointer, not a string. You have to manually allocate the space you need for the string before calling strcat. The correct way to write this code is something like this:

/* Needs <errno.h>, <stdio.h>, <string.h>, and <syslog.h>;
   xmalloc, xrealloc, and xstrdup are described in the notes below. */
FILE *fp = fopen(filename, "r");
if (!fp) {
    syslog(LOG_INFO, "failed to open %s: %s", filename, strerror(errno));
    return xstrdup("<!DOCTYPE html><html lang=\"en-US\"><head><title>Test</title>"
                  "</head><body><h1>Does Work</h1></body></html>");
} else {
    size_t capacity = 4096, offset = 0, n;
    char *content = xmalloc(capacity);
    /* Read until EOF or error, doubling the buffer whenever it fills up. */
    while ((n = fread(content + offset, 1, capacity - offset, fp)) > 0) {
        offset += n;
        if (offset == capacity) {
            capacity *= 2;
            content = xrealloc(content, capacity);
        }
    }
    /* fread returns a size_t, which is never negative; ask ferror instead. */
    if (ferror(fp))
        syslog(LOG_INFO, "read error from %s: %s", filename, strerror(errno));
    content[offset] = '\0';   /* offset < capacity always holds here */
    fclose(fp);
    return content;
}

Notes:

  1. Error messages triggered by I/O failures should ALWAYS include strerror(errno).
  2. xmalloc, xrealloc, and xstrdup are wrapper functions around their counterparts with no leading x; they crash the program rather than return NULL. This is almost always less grief than trying to recover from out-of-memory by hand in every single place where it can happen.
  3. I return xstrdup("...") rather than "..." in the failed-to-open case so that the caller can always call free(content). Calling free on a string literal will crash your program.
  4. Gosh, that was a lot of work, wasn't it? This is why people tend to prefer to write web apps in a higher-level language. ;-)
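The x-wrappers from note 2 are not standard library functions, so each project supplies its own. One possible shape is sketched below; the die_oom helper and the exact failure message are assumptions, and real code bases differ in how they report the failure before exiting.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical out-of-memory handler: report and terminate. */
static void die_oom(void)
{
    fputs("fatal: out of memory\n", stderr);
    abort();
}

/* Like malloc, but never returns NULL for a nonzero size. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL && size != 0)
        die_oom();
    return p;
}

/* Like realloc, but never returns NULL for a nonzero size.
 * Callers are assumed to pass size > 0. */
void *xrealloc(void *ptr, size_t size)
{
    void *p = realloc(ptr, size);
    if (p == NULL && size != 0)
        die_oom();
    return p;
}

/* Heap-allocated copy of s; the caller owns it and must free() it. */
char *xstrdup(const char *s)
{
    size_t len = strlen(s) + 1;          /* include the terminating '\0' */
    return memcpy(xmalloc(len), s, len);
}
```

With these in place the calling code can use the results directly, as the answer's snippet does, because a failed allocation never returns to the caller.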

content is just a pointer to a string, not an actual string: it has 0 bytes of space reserved for your string. You need to allocate memory large enough to hold your string. Note that you will have to free it afterwards.

char *content=malloc(256);

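With that, your code should be OK as long as the file fits in the buffer and the buffer starts out as an empty string (content[0] = '\0') before the first strcat. Oh, and I suggest using strncat.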

The second assignment to content worked before because you were setting the pointer to point to your constant string. If you change content to point to a malloc'ed region of memory, you will also want to strncpy your fixed string into content.
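A common stumbling block with strncat is that its third argument limits how much of the source is appended; it is not the total size of the destination. The sketch below wraps that arithmetic in one place. The name append_bounded is hypothetical, and the helper assumes dst already holds a valid string shorter than dstsize.

```c
#include <string.h>

/* Append src to the string in dst without writing past dst[dstsize - 1].
 * Returns 1 if all of src fit, 0 if it was truncated.
 * append_bounded is a hypothetical helper for illustration. */
int append_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);           /* dst must already be a string */
    size_t room = dstsize - used - 1;    /* space left, excluding the '\0' */

    /* strncat appends at most `room` characters and then a terminator. */
    strncat(dst, src, room);
    return strlen(src) <= room;
}
```

For the buffer above, that would mean setting content[0] = '\0' right after malloc(256) and then calling append_bounded(content, 256, line) inside the loop, and likewise for the fixed string in the else branch.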

Ideally, if you can use C++, use std::string.

content is a wild pointer; the variable contains garbage, so it's pointing somewhere out in left field. When you copy data to it using strcat, the data goes to some random, probably bad, location. The cure for this is to make content point somewhere good. Since you want it to outlive your function call, it needs to be allocated someplace besides the function's call stack. You need to use malloc() to allocate some space on the heap. Then the caller will own the memory, and should call free() to release it when it's no longer needed.

You'll also need to change the else part that assigns directly to content so that it uses strcpy; that way the free() will always be valid. You can't free something that you didn't allocate!

Throughout this code, make sure you remember how much space you allocated with malloc(), and don't write more data than you have space for, or you'll get more crashes.
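One way to "remember how much space you allocated" is to keep the pointer, the length in use, and the capacity together, and to route every write through a function that checks them. This is only a sketch; struct strbuf and its functions are hypothetical names, not something from the question.

```c
#include <stdlib.h>
#include <string.h>

/* A string buffer that remembers its own size. All names are hypothetical. */
struct strbuf {
    char  *data;   /* always NUL-terminated once initialised */
    size_t len;    /* bytes in use, excluding the '\0' */
    size_t cap;    /* bytes allocated */
};

/* Returns 0 on success, -1 if malloc fails. */
int strbuf_init(struct strbuf *b, size_t cap)
{
    if (cap == 0)
        cap = 1;                         /* need room for the terminator */
    b->data = malloc(cap);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    b->len = 0;
    b->cap = cap;
    return 0;
}

/* Append s, growing the allocation first if it would not fit.
 * Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
int strbuf_append(struct strbuf *b, const char *s)
{
    size_t n = strlen(s);

    if (b->len + n + 1 > b->cap) {
        size_t newcap = b->cap;
        while (newcap < b->len + n + 1)
            newcap *= 2;
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;
    }
    memcpy(b->data + b->len, s, n + 1);  /* copies the '\0' as well */
    b->len += n;
    return 0;
}
```

The function could then return b.data to its caller, who frees it with free() as described above.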

char *foo is only a pointer to some piece of memory holding the characters that form the string. So you cannot use strcat, because you don't have any memory to copy to. Inside the if statement you are allocating local memory on the stack with char line[256], which holds the line, but since that memory is local to the function it will disappear once the function returns, so you cannot return line;.

So what you really want is to allocate some persistent memory, e.g. with strdup or malloc, so that you can return it from the function. Note that you cannot mix constants and allocated memory (because the user of your function must free the memory, which is only possible if it is not a constant).
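To make the difference between the local array and persistent memory concrete, here is a small sketch that returns a heap copy of one line. The function first_line is hypothetical. It uses malloc plus memcpy rather than strdup only because strdup is a POSIX function and was not part of ISO C before C23; on a POSIX system strdup(line) would do the same job.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of the next line of fp, or NULL at EOF,
 * on a read error, or if malloc fails. The caller must free() the result.
 * first_line is a hypothetical helper. */
char *first_line(FILE *fp)
{
    char line[256];               /* local: gone once the function returns */

    if (fgets(line, sizeof line, fp) == NULL)
        return NULL;

    /* `return line;` here would hand back a dangling pointer.
     * Copy into memory that outlives the call instead. */
    size_t len = strlen(line) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, line, len);
    return copy;
}
```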

So you could use something like this:

char * getPage(const char *filename) {
    FILE *pFile;
    char *content;
    pFile = fopen(filename, "r");
    if (pFile != NULL) {
        syslog(LOG_INFO,"Reading from:%s",filename);
        /* check the size and allocate memory */
        fseek(pFile, 0, SEEK_END);
        if (!(content = malloc(ftell(pFile) + 1))) {
            /* out of memory: give up rather than write through NULL */
            fclose(pFile);
            return NULL;
        }
        rewind(pFile);
        /* set the content to be empty */
        *content = 0;
        char line [256];
        while (fgets(line, sizeof line, pFile) != NULL) {
            syslog(LOG_INFO,">>>>>>>Fail Here<<<<<<<");
            strcat(content, line);
        }
        fclose(pFile);
    } else {
        content = strdup("<!DOCTYPE html><html lang=\"en-US\"><head><title>Test</title></head><body><h1>Does Work</h1></body></html>");
        syslog(LOG_INFO,"Reading from:%s failed, serving static response",filename);
    }
    return content;
}

It is not the most efficient way of doing this (because strcat has to find the end every time), but it is the smallest modification of your code.
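If the repeated scanning ever matters, the usual remedy is to keep track of where the string currently ends and copy each line straight to that position. A sketch of the loop rewritten that way follows; read_lines and its parameters are hypothetical, and the function assumes the caller has already allocated capacity bytes (at least 1) for content.

```c
#include <stdio.h>
#include <string.h>

/* Read lines from fp into content (capacity bytes, capacity >= 1),
 * tracking the end offset so each append costs only the length of the
 * new line instead of rescanning the whole buffer as strcat does.
 * Stops early if the next line would not fit.
 * Returns the length of the resulting string. */
size_t read_lines(FILE *fp, char *content, size_t capacity)
{
    char line[256];
    size_t end = 0;

    content[0] = '\0';
    while (fgets(line, sizeof line, fp) != NULL) {
        size_t n = strlen(line);
        if (end + n + 1 > capacity)
            break;                          /* would overflow: stop */
        memcpy(content + end, line, n + 1); /* includes the '\0' */
        end += n;
    }
    return end;
}
```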

An earlier answer suggested the solution:

char content[256];

This buffer will not be large enough to hold anything but the smallest files, and the array content goes out of scope when return content; is executed, so the returned pointer is left dangling. (Your earlier line, content = "static.."; is fine, because the string is placed in the .rodata data segment and its pointer will always point to the same data for the entire lifetime of the program.)

If you allocate the memory for content with malloc(3), you can "grow" the space as required with realloc(3), but this introduces the potential for a horrible error: whatever you handed the pointer to must clean up the memory allocation when it is done with the data (or else you leak memory), and it cannot simply call free(3), because the content pointer might point to statically allocated memory.

So, you have two easy choices:

  • use strdup(3) to duplicate the static string each time you need it, and use content = malloc(size); for the non-static path
  • make your caller responsible for providing the memory; every call needs to provide sufficient memory to handle either the contents of the file or the static string.

I would probably prefer the first approach, if only because the size needed for the second approach cannot be known prior to the call.
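For comparison, the second choice might look roughly like the sketch below, where the caller owns the buffer and nothing needs to be freed. The names get_page_into and FALLBACK_PAGE are hypothetical, and the fallback markup is shortened for the example. The sketch also shows the drawback just mentioned: if the buffer is too small, the page is silently truncated.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fallback page, shortened for the example. */
#define FALLBACK_PAGE "<!DOCTYPE html><html><body><h1>Not found</h1></body></html>"

/* Fill buf (bufsize bytes) with the file's contents, or with the fallback
 * page if the file cannot be opened. Output is truncated if buf is too
 * small. Returns the number of bytes stored, excluding the '\0'. */
size_t get_page_into(const char *filename, char *buf, size_t bufsize)
{
    size_t n;
    FILE *fp;

    if (bufsize == 0)
        return 0;                      /* no room even for the terminator */

    fp = fopen(filename, "r");
    if (fp != NULL) {
        n = fread(buf, 1, bufsize - 1, fp);
        fclose(fp);
    } else {
        n = strlen(FALLBACK_PAGE);
        if (n > bufsize - 1)
            n = bufsize - 1;
        memcpy(buf, FALLBACK_PAGE, n);
    }
    buf[n] = '\0';
    return n;
}
```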
