How to use fscanf() format string

I am using fscanf() to read input from a file (I know I should be using fgets(), but I'm not allowed to) and I can't figure out how to write the format string correctly.

The input is in the format: M 03f8ab8,1

I need the letter, address, and number each saved to its own variable. Here is what I've got so far:

while(fscanf(file, " %s %s, %d", operation, address, &size) != -1)

As written, it puts the letter into the correct variable (operation), but adds ,number to the end of the address and then leaves something undefined in the size variable.

It should put each value into its own variable (and ignore the comma).

How do I set up fscanf() to get this right?

The problem here is that the "%s" format reads whitespace-delimited strings, and since there's no whitespace in 03f8ab8,1 it will all be read as a single string.

You can solve this with the "%[" format, which allows some very simple pattern matching. You can, for example, use it to tell fscanf to read everything up to (but not including) a comma:

fscanf(file, "%s %[^,], %d", operation, address, &size)

See e.g. a scanf (and family) reference for more details.

Also, you shouldn't compare the result of fscanf with -1. Instead, check that it parsed the correct number of items by comparing the return value with 3:

while (fscanf(file, "%s %[^,], %d", operation, address, &size) == 3) ...

Note that the above format does not impose any limit on the strings it reads, which can overflow your buffers. If your strings are of a fixed size (i.e. they're arrays), use the format's maximum field width to limit the number of characters that fscanf will read and put into each array.

For example (without knowing anything at all about your actual strings/arrays):

while (fscanf(file, "%1s %8[^,], %d", operation, address, &size) == 3) ...

With the above, the first string can't be longer than one single character, and the second can't be longer than eight characters. Note that these numbers do not include the string null-terminator, so each array needs room for one more element than the width given in the format.

fscanf(input_fp, "%30[^ ,\n\t]%30[^ ,\n\t]%30[^ ,\n\t]", ...

does not consume the ',' nor the '\n' in the text file. Subsequent fscanf() attempts therefore also fail and return 0, which, not being EOF, causes an infinite loop.


Switching from an fscanf() solution to fgets()/sscanf() handles potential I/O and parsing errors better:

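#include <stdio.h>

#define VFMT " %29[^ ,\n\t]"  // width is 1 less than the buffer size

int main(void)
{
    // The original post did not show how the files were opened;
    // these file names are placeholders.
    FILE *input_fp = fopen("input.txt", "r");
    if (input_fp == NULL)
        return 1;
    FILE *output_fp = fopen("output.txt", "w");
    if (output_fp == NULL)
    {
        fclose(input_fp);
        return 1;
    }
    char buf[100];
    while (fgets(buf, sizeof buf, input_fp) != NULL)
    {
      char name[30];  // Ensure this size is 1 more than the width in scanf format.
      char age_array[30];
      char occupation[30];
      int n;  // Use to check for trailing junk

      if (3 == sscanf(buf, VFMT "," VFMT "," VFMT " %n",
          name, age_array, occupation, &n) && buf[n] == '\0')
      {
        // Suspect OP really wants this width to be 1 more
        if (fprintf(output_fp, "%-30s%-30s%-30s\n", name, age_array, occupation) < 0)
          break;  // output error
      } else
        break;  // format error
    }
    fclose(input_fp);
    fclose(output_fp);
    return 0;
}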

Rather than calling ferror(), check the return values of fgets() and fprintf().

I suspect OP's undeclared field buffers were [30], so I adjusted the scanf() widths accordingly.

Details about if (3 == sscanf(buf, VFMT "," ...

The if (3 == sscanf(...) && buf[n] == '\0') { becomes true when:

1) each of the 3 "%29[^ ,\n\t]" format specifiers scans in at least 1 char.

2) buf[n] is the end of the string. n is set via the "%n" specifier. The preceding ' ' in " %n" causes any white-space following the last "%29[^ ,\n\t]" to be consumed. When sscanf() reaches "%n", it stores the current offset from the beginning of the scan into the int pointed to by &n.

VFMT "," VFMT "," VFMT " %n" is concatenated by the compiler to

" %29[^ ,\n\t], %29[^ ,\n\t], %29[^ ,\n\t] %n".

I find the former easier to maintain than the latter. 我发现前者比后者更易于维护。

The first space in " %29[^ ,\n\t]" directs sscanf() to scan over (consume and not save) 0 or more white-space characters (' ', '\t', '\n', etc.). The rest directs sscanf() to consume and save 1 to 29 char that are none of ' ', ',', '\n', '\t', then append a '\0'.
