Unexpected repetition using fgets and sscanf

Here is part of my code. The aim of fgets and sscanf is to scan three values separated by exactly one space. If the line passes, the program echoes the instruction. Otherwise, it prints an error and exits.

I want to use a char array of length 7 to limit the number of characters in the line, so that only input of the form 'g 3 3' is accepted. But something seems to be wrong in my code.

#include <stdio.h> 

int main (void) {
    char line[7];
    char command;
    int x, y, nargs;

    while(1){
        /* problem: g  4 4 or g 4  4 can also pass */
        fgets(line, 7, stdin);
        nargs = sscanf(line, "\n%c %d %d", &command, &x, &y);

        if(nargs != 3){
          printf("error\n");
          return 0;
        }

        printf("%c %d %d\n", command, x, y);
    }
}

Unexpected:

g  4 4
g 4 4
error

Expected:

g 4 4
g 4 4
// I can continue typing

Can anyone tell me why it will still repeat the instruction?

According to the C11 standard, 7.21.6.2p5:

A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read.

This describes the \n directive and the two space characters as being identical in functionality: each matches as much consecutive white-space (spaces, tabs, newlines, etc.) as it can from the input.

If you want to match a single space (and only a single space), I suggest using %*1[ ] instead of the white-space directives. You could use %*1[\n] to similarly discard a newline. For example, since the newline character appears at the end of a line:
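A minimal sketch of that behaviour, assuming a hypothetical helper named count_fields that wraps the format string from the question. It only reports how many fields sscanf converted, so lines with different spacing can be compared:

```c
#include <stdio.h>

/* Hypothetical helper (not part of the original program): returns the
   number of fields converted by the question's format string. */
int count_fields(const char *line) {
    char command;
    int x, y;
    /* "\n" and " " are both white-space directives: each one skips any
       amount of white-space, including none at all. */
    return sscanf(line, "\n%c %d %d", &command, &x, &y);
}
```

With this format, "g 4 4\n" and "g  4 4\n" both convert three fields, which is why the extra space goes unnoticed.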

nargs = sscanf(line, "%c%*1[ ]%d%*1[ ]%d%*1[\n]", &command, &x, &y);

Unfortunately, this won't completely solve your problem, as the %d format specifier is also defined to discard white-space characters:

Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier.

With some clever hacks you might be able to keep using sscanf (or, better yet, scanf without the intermediate buffer). But once the alternatives are compared on maintainability cost, we might as well just use getchar. So if you're looking for a solution to your problem, as opposed to an answer to the question you posed, I'd recommend gsamaras' answer.

What you have there won't work, since sscanf() doesn't care whether the user types one space or two.

You could approach this in a simple way, by taking advantage of short-circuiting and by using getchar(), like this:
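To illustrate the limitation, here is a small sketch assuming a hypothetical helper named count_fields_strict. It uses %*1[ ] between the fields, yet a doubled space still passes, because %d skips the second space on its own:

```c
#include <stdio.h>

/* Hypothetical helper (illustration only): counts the fields converted
   when the separators are written as %*1[ ]. */
int count_fields_strict(const char *line) {
    char command;
    int x, y;
    /* %*1[ ] consumes exactly one space and fails on anything else, but
       the %d that follows still skips any further leading white-space. */
    return sscanf(line, "%c%*1[ ]%d%*1[ ]%d", &command, &x, &y);
}
```

So %*1[ ] rejects a missing space or a tab, but it cannot reject an extra space in front of a number.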

#include <stdio.h>
#include <ctype.h>

#define SIZE 100

int main(void) {
    int c, i = 0;
    char line[SIZE] = {0};
    while ((c = getchar()) != EOF) {
        // is the first char an actual character?
        if(i == 0 && !isalpha(c)) {
                printf("error\n");
                return -1;
        // do I have two whitespaces in 2nd and 4th position?
        } else if((i == 1 || i == 3) && c != ' ') {
                printf("error\n");
                return -1;
        // do I have digits in 3rd and 5th position?
        } else if((i == 2 || i == 4) && !isdigit(c)) {
                printf("error\n");
                return -1;
        // I expect that the user hits enter after inputing his command
        } else if(i == 5 && c != '\n') {
                printf("error\n");
                return -1;
        // everything went fine, I am done with the input, print it
        } else if(i == 5) {
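                // print only the 5 command characters; line[5] may still
                // hold the newline stored by a previous command
                printf("%.5s\n", line);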
        }
        line[i++] = c;
        if(i == 6)
                i = 0;
    }
    return 0;
}

Output:

gsamaras@gsamaras:~$ gcc -Wall px.c
gsamaras@gsamaras:~$ ./a.out 
g 4 4
g 4 4
g  4 4
error

Can anyone tell me why it will still repeat the instruction?

The tricky part is that "%d" consumes leading white-space, so the code needs to detect leading white-space first.

" " consumes 0 or more white-space characters and never fails.

So "\n%c %d %d" cannot detect the number of intervening spaces.


If the ints can be more than 1 character long, use this; otherwise see the simplification below.

Use "%n" to record how far sscanf() has progressed through the buffer.

This gets the job done using sscanf(), which apparently is required.

// No need for a tiny buffer
char line[80];
if (fgets(line, sizeof line, stdin) == NULL) Handle_EOF();

int n[6];
n[5] = 0;
#define SPACE1 "%n%*1[ ] %n"
#define EOL1   "%n%*1[\n] %n"

// Return value not checked as following `if()` is sufficient to detect scan completion.
// See below comments for details
sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" EOL1, 
  &command, &n[0], &n[1],
  &x,       &n[2], &n[3],
  &y,       &n[4], &n[5]);

// If scan completed to the end with no extra
if (n[5] && line[n[5]] == '\0') {
  // Only 1 character between?
  if ((n[1] - n[0]) == 1 && (n[3] - n[2]) == 1 && (n[5] - n[4]) == 1) {
    Success(command, x, y);
  }
}

Maybe add a test to ensure command is not white-space, but I think that will happen anyway in command processing.
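The fragment above can be packaged into a self-contained function. This is only a sketch: the name parse_command and its 1/0 return convention are assumptions made for illustration, and it includes the optional white-space test on command as well as the redundant return-value check.

```c
#include <ctype.h>
#include <stdio.h>

/* Each macro records the offset, matches exactly one character of the
   given kind, swallows any further white-space, then records the offset
   again. A difference of 1 means exactly one separator was present. */
#define SPACE1 "%n%*1[ ] %n"
#define EOL1   "%n%*1[\n] %n"

/* Hypothetical helper: returns 1 and fills *command, *x, *y when line is
   exactly "<char> <int> <int>\n"; returns 0 otherwise. */
int parse_command(const char *line, char *command, int *x, int *y) {
    int n[6] = {0};
    int cnt = sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" EOL1,
                     command, &n[0], &n[1], x, &n[2], &n[3], y, &n[4], &n[5]);

    /* n[5] stays 0 unless the scan reached the final %n. */
    if (cnt != 3 || n[5] == 0 || line[n[5]] != '\0')
        return 0;
    /* Reject a white-space command character such as ' '. */
    if (isspace((unsigned char)*command))
        return 0;
    return (n[1] - n[0]) == 1 && (n[3] - n[2]) == 1 && (n[5] - n[4]) == 1;
}
```

Because the offsets are compared rather than the field widths, multi-digit values such as "g 12 345\n" are accepted, while "g  4 4\n" is rejected.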


A simplification is possible if the ints must be exactly 1 digit, using a modification that combines @Seb's answer with the above. This works because the length of each field is fixed in an acceptable line.

// Scan 1 and only 1 space
#define SPACE1 "%*1[ ]"

int n = 0;
// Return value not checked as following `if()` is sufficient to detect scan completion.
sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n", &command, &x, &y, &n);

// Adjust this to accept a final \n or not as desired.
if (n == 5 && (line[n] == '\n' || line[n] == '\0')) {
  Success(command, x, y);
}

@Seb and I dove into the need for checking the return value of sscanf(). The cnt == 3 test is redundant, since n == 5 will only be true when the entire line was scanned and sscanf() returned 3. Even so, a number of code checkers may raise a flag noting that the result of sscanf() is not checked. Using the saved variables without first qualifying the result of sscanf() is not robust code; this approach qualifies it with a simple and sufficient check of n == 5. Since many code problems stem from not doing any qualification, the missing check of the sscanf() return value can raise a false positive among code checkers. It is easy enough to add the redundant check.

// sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n", &command, &x, &y, &n);
// if (n == 5 && (line[n] == '\n' || line[n] == '\0')) {
int cnt = sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n", &command, &x, &y, &n);
if (cnt == 3 && n == 5 && (line[n] == '\n' || line[n] == '\0')) {
  Success(command, x, y);
}
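For reference, the single-digit variant can be sketched as a complete function. The name parse_fixed and its 1/0 return convention are assumptions for illustration; the format string and the n == 5 test are the ones discussed above.

```c
#include <stdio.h>

/* Scan 1 and only 1 space */
#define SPACE1 "%*1[ ]"

/* Hypothetical helper: returns 1 when line is "<char> <digit> <digit>"
   optionally followed by a newline, 0 otherwise. */
int parse_fixed(const char *line, char *command, int *x, int *y) {
    int n = 0;
    int cnt = sscanf(line, "%c" SPACE1 "%d" SPACE1 "%d" "%n",
                     command, x, y, &n);
    /* A valid line consumes exactly 5 characters, so any extra space or
       extra digit pushes n past 5. */
    return cnt == 3 && n == 5 && (line[n] == '\n' || line[n] == '\0');
}
```

An extra space is still skipped by %d, but it shifts the recorded offset to 6, so the line is rejected by the length test rather than by the format itself.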

Do you have a problem with your program? gdb is your best friend =)

gcc -g yourProgram.c
gdb ./a.out
break fgets
run
finish
g 4  4

Then step through the statements; whenever you reach scanf or printf, just type finish. You will see that the program completes this iteration successfully, but then it does not wait for input and just prints the error message. Why? Well, type:

man fgets

fgets reads at most ONE LESS than size, so in your case fgets is only allowed to read 6 characters, but you gave it 7! Yes, the newline is a character just like the space. So what happens to the 7th? It stays buffered, which means that instead of reading from the keyboard, your program will see that there are characters in the buffer and use them (one character in this example).

Edit: here is what you can do to make your program work. You can ignore empty lines: if (strcmp(line, "\n") == 0), jump to the next iteration. If you are not allowed to use strcmp, a workaround is to compare line[0] == '\n'.
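A different way to handle the leftover characters, sketched here as an alternative to ignoring empty lines, is to detect the truncated read and discard the rest of the line. The helper name read_full_line and its return codes are assumptions made for this illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf.
   Returns  1 if the whole line (including its newline) fit in buf,
            0 if no newline was stored (line too long or unterminated);
              the unread remainder of the line is discarded,
           -1 if nothing could be read (end of file or error). */
int read_full_line(FILE *in, char *buf, size_t size) {
    int c;

    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    if (strchr(buf, '\n') != NULL)
        return 1;
    /* fgets stopped early: drop everything up to the end of the line so
       the next call starts on a fresh line. */
    while ((c = getc(in)) != '\n' && c != EOF) {
        /* discard */
    }
    return 0;
}
```

With a 7-byte buffer, the input "g  4 4\n" yields 0 because its newline did not fit, so the caller can report the error immediately instead of tripping over a stray "\n" on the next read.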
