
Multiple fscanf

I have written the following program that is intended to read a string from a file into variable "title":

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int m, b;
    char *title;
    FILE *fp;

    fp = fopen("input2.txt", "r");
    if (fp == NULL)
    {
         printf ("Error: file cannot be found\n");
         return 1;
    }

    fscanf(fp, "<%d>\n<%d>", &m, &b);
    printf("%d\n%d", m, b);
    fscanf(fp, "<%s>", title);

    fclose(fp);
    return 0;
}

The above program crashes at the second call to fscanf. Why does this happen?

Your main problem is that you've not allocated space for the string to be read into. You can do this in multiple ways:

char title[256];

or:

char *title = malloc(256);
if (title == NULL)
{
    fprintf(stderr, "Out of memory\n");
    exit(1);
}

either of which should then be used with:

if (fscanf(fp, " <%255[^>]>", title) != 1)
{
    fprintf(stderr, "Oops: format error\n");
    exit(1);
}

or, if you have a system with an implementation of fscanf() that's compliant with POSIX 2008, you can use the m modifier with %s (or with %c, or, in this case, a scanset %[...]; more on that below):

char *title = 0;

if (fscanf(fp, " <%m[^>]>", &title) != 1)  // Note the crucial &
{
    fprintf(stderr, "Oops: format error\n");
    exit(1);
}

This way, if the fscanf() succeeds in its entirety, the function will allocate the memory for the title, and you are responsible for calling free(title) when you are done with it. If it fails, the memory will have been released (or never assigned).

Note that I changed %s to %m[^>]. This is necessary because the original conversion will never leave a > for the format to match. If there is a > in the input, it will be incorporated into the result string because %s reads up to white space, and > is not white space. Further, you won't be able to tell whether the trailing context was ever matched; that's the > in the original format, and it's still a problem (or not) in the revised code I'm suggesting.
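The difference between %s and the scanset is easy to see on a string. The following is a minimal sketch, not part of the original program: the helper names are hypothetical, and sscanf() stands in for fscanf() on the assumption that only the format rules matter here (they are the same for both functions).

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only. sscanf() follows the same
   format rules as fscanf(), so a string stands in for the file. */

/* Original style of conversion: %s stops only at white space, so any '>'
   that directly follows the text ends up inside title. */
int scan_title_with_s(const char *input, char title[256])
{
    return sscanf(input, " <%255s>", title);
}

/* Scanset conversion: %[^>] stops at the first '>', leaving it in the
   input for the literal '>' in the format to match. */
int scan_title_with_scanset(const char *input, char title[256])
{
    return sscanf(input, " <%255[^>]>", title);
}
```

Given the input `<Hello>`, the first helper stores `Hello>` and the second stores `Hello`. Given `<Hello World>`, the first stores only `Hello`, while the second stores `Hello World`. Both return 1 in every one of these cases, which is why the return value alone cannot tell you what happened to the closing >.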

I also added a space at the start of the string to match optional white space. Without that, the < at the start of the string must immediately follow the > after the second number (on the same line, with nothing in between), assuming that the > is present at all. You should also check the return from the first fscanf():

if (fscanf(fp, "<%d>\n<%d>", &m, &b) != 2)
{
    fprintf(stderr, "Oops: format error\n");
    exit(1);
}

Note that the embedded newline simply looks for white space between the > and the <; that's zero or more blanks, tabs or newlines. Also note that you'll never know whether the second > was matched or not.
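If you do need to know whether a trailing > was matched, one common technique (not used in the code above) is to append the %n conversion, which records how many characters have been consumed and is only reached if everything before it matched. The sketch below assumes a hypothetical helper name and uses sscanf() on a string for brevity; the same format should work with fscanf().

```c
#include <stdio.h>

/* Hypothetical helper for illustration.
   Returns  1 if "<title>" matched completely, including the closing '>',
            0 if a title was read but the closing '>' was not matched,
           -1 if no title could be read at all. */
int scan_title_checked(const char *input, char title[256])
{
    int end = -1;   /* stays -1 unless the %n conversion is reached */

    /* %n does not count toward the return value, so success is still 1. */
    if (sscanf(input, " <%255[^>]>%n", title, &end) != 1)
        return -1;
    return (end >= 0) ? 1 : 0;
}
```

For example, `scan_title_checked("<abc>", title)` returns 1, whereas `scan_title_checked("<abc", title)` returns 0 because the scan stops at the missing > and never reaches %n.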

You could use exit(EXIT_FAILURE); in place of exit(1); or, since this code is in main(), you could use either return 1; or return(EXIT_FAILURE); where the parentheses are optional in either case but their presence evokes unwarranted ire in some people.

You could also improve the error messages. And you should consider using fgets() or POSIX's getline() followed by sscanf() because it makes it easier (by far) to do good error reporting, plus you can rescan the data easily if the first attempt at converting it fails.
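A minimal sketch of the fgets() plus sscanf() approach might look like the following. The helper name, the 256-byte line buffer, and the return codes are assumptions made for illustration; they are not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from fp and parse "<number>" from it.
   Returns 0 on success, -1 on end of file or read error,
   and -2 if the line does not have the expected format. */
int read_bracketed_int(FILE *fp, int *value)
{
    char line[256];   /* assumed maximum line length */

    if (fgets(line, sizeof line, fp) == NULL)
        return -1;

    if (sscanf(line, " <%d>", value) != 1)
    {
        /* The whole offending line is still available for the message. */
        fprintf(stderr, "Format error in line: %s", line);
        return -2;
    }
    return 0;
}
```

Because each call consumes a whole line, a malformed line does not leave the stream stuck at the bad data: the next call simply moves on to the following line, and the line buffer could be rescanned with a different format if needed.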

This:

    char *title;

is just a pointer to a char, and it is never initialized, so it does not point at any storage you own. When fscanf writes the string through it, you will corrupt whatever happens to be in memory at that address, or the program will crash.

You need to do one of two things:

    char title[50]; // Holds up to 49 characters, plus termination

Or:

    #include <stdlib.h>
    // ...
    char *title = malloc(50 * sizeof(char)); // Same capacity as above
    if (title == NULL) {
        // handle out of mem error
    }
    // ...
    free (title);

The first option is obviously much simpler, but requires you to know your array size at compile time.

If you are new to programming, and haven't encountered pointers and dynamic memory allocation yet, stick with the first option for now.
