
C programming language (scanf)

I have read strings with spaces in them using the following scanf() statement.

scanf("%[^\n]", &stringVariableName);

What is the meaning of the control string [^\n]?

Is this an okay way to read strings that contain white space?

This means "read everything up to, but not including, the next '\n'".

This is OK, but it would be better to say "read everything up to the next '\n', or stop once my buffer is full", by giving a maximum field width:

char stringVariableName[256] = {0};
if (scanf("%255[^\n]", stringVariableName) == 1)
    ...

Edit: removed the & from the argument and added a check of scanf's return value.
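To make the snippet above concrete, here is a minimal sketch of a line reader built on the width-limited scanset. The names read_line and LINE_CAP are made up for illustration and the 256-byte buffer is just an assumption; the point is that the return value is checked and the leftover '\n' is consumed so the next read does not fail immediately.

```c
#include <assert.h>
#include <stdio.h>

enum { LINE_CAP = 256 };  /* hypothetical buffer size; the width in the format must be LINE_CAP - 1 */

/* Reads one line from `in` into dst, which must hold at least LINE_CAP bytes.
   Returns 1 if a line (possibly empty) was read, 0 on end of file or read error. */
int read_line(FILE *in, char dst[LINE_CAP])
{
    assert(in != NULL && dst != NULL);

    dst[0] = '\0';                       /* an empty line stores nothing, so start with "" */
    int rc = fscanf(in, "%255[^\n]", dst);
    if (rc == EOF)
        return 0;                        /* nothing left to read */

    /* rc is 1 (text stored) or 0 (the line was empty).
       Discard whatever is left of the line, including the '\n'. */
    int c;
    while ((c = getc(in)) != '\n' && c != EOF)
        ;
    return 1;
}
```

Calling read_line(stdin, buf) in a loop until it returns 0 reads the input line by line; anything past 255 characters on a line is silently dropped by this sketch.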

The format specifier "%[^\n]" instructs scanf() to read up to but not including the newline character. From the linked reference page:

matches a non-empty sequence of character from set of characters. 

If the first character of the set is ^, then all characters not
in the set are matched. If the set begins with ] or ^] then the ]
character is also included into the set.

If the string is on a single line, fgets() is an alternative, but the newline must be removed because fgets() stores it in the destination buffer. fgets() also forces the programmer to specify the maximum number of characters that can be read into the buffer, making a buffer overrun less likely:

char buffer[1024];
if (fgets(buffer, 1024, stdin))
{
    /* Remove newline. */
    char* nl = strrchr(buffer, '\n');
    if (nl) *nl = '\0';
}

It is possible to specify the maximum number of characters to read via scanf():

scanf("%1023[^\n]", buffer);

but the width is easy to forget with scanf(), whereas it is impossible to forget the size with fgets(), because the compiler rejects a call with a missing argument. Of course, the programmer could still pass the wrong size, but at least they are forced to consider it.

Technically, the call in the question can't be well defined. From the description of the %[ directive:

Matches a nonempty sequence of characters from a set of expected characters (the scanset).

If no l length modifier is present, the corresponding argument shall be a pointer to the initial element of a character array large enough to accept the sequence and a terminating null character, which will be added automatically.

Supposing the declaration of stringVariableName looks like char stringVariableName[x];, then &stringVariableName is a char (*)[x], not a char *. The type is wrong, so the behaviour is undefined. It might work by coincidence, but anything that relies on coincidence doesn't work by my definition.

The only way to form a char * using &stringVariableName is if stringVariableName is a char! This implies that the character array is only large enough to hold a terminating null character. If the user enters one or more characters before pressing enter, scanf writes beyond the end of the character array and invokes undefined behaviour. If the user merely presses enter, the %[...] directive fails and not even a '\0' is written to your character array.


Now, with that all said and done, I'll assume you meant this: scanf("%[^\n]", stringVariableName); (note the omitted ampersand)

You really should be checking the return value!!

A %[ directive causes scanf to retrieve a sequence of characters consisting of those specified between the [ square brackets ]. A ^ at the beginning of the set indicates that the desired set contains all characters except for those between the brackets. Hence, %[^\n] tells scanf to read as many non-'\n' characters as it can, and store them into the array pointed to by the corresponding char *.

The '\n' will be left unread. This could cause problems: the next %[^\n] will see that '\n' first and match nothing. An empty field results in a match failure, and in that situation it's possible that no data will be copied into your array (not even a terminating '\0' character). For this reason (and others), you really need to check the return value!
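A small sketch of why the return value matters, using sscanf on in-memory strings so the three outcomes are easy to see. The helper name parse_field and the 64-byte buffer are assumptions for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Tries to read one field up to the next '\n' from `input` into dst (64 bytes).
   Returns what sscanf returned:
     1   -> a non-empty field was stored in dst
     0   -> match failure: the input starts with '\n', dst was not written by sscanf
     EOF -> input failure: the input was empty
   On anything other than 1, dst is set to "" here, because sscanf itself
   leaves it untouched (not even a '\0' is stored). */
int parse_field(const char *input, char dst[64])
{
    assert(input != NULL && dst != NULL);

    int rc = sscanf(input, "%63[^\n]", dst);
    if (rc != 1)
        dst[0] = '\0';
    return rc;
}
```

With "abc\ndef" as input the helper returns 1 and stores "abc"; with "\n" it returns 0; with "" it returns EOF. Code that ignores the result and prints the array after the second or third case would be reading whatever happened to be in it beforehand.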

Which manual contains information about the return values of scanf? The scanf manual.

Reading from the man pages for scanf() ...

[ Matches a non-empty sequence of characters from the specified set of accepted characters; the next pointer must be a pointer to char, and there must be enough room for all the characters in the string, plus a terminating null byte. The usual skip of leading white space is suppressed. The string is to be made up of characters in (or not in) a particular set; the set is defined by the characters between the open bracket [ character and a close bracket ] character. The set excludes those characters if the first character after the open bracket is a circumflex (^). To include a close bracket in the set, make it the first character after the open bracket or the circumflex; any other position will end the set. The hyphen character - is also special; when placed between two other characters, it adds all intervening characters to the set. To include a hyphen, make it the last character before the final close bracket. For instance, [^]0-9-] means the set "everything except close bracket, zero through nine, and hyphen". The string ends with the appearance of a character not in the (or, with a circumflex, in) set or when the field width runs out.
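A couple of scansets in action, as a sketch rather than anything canonical. The function names are hypothetical. Note that the meaning of a hyphen range such as 0-9 is left to the implementation by the C standard; the man page quoted above describes the common behaviour, which is what this assumes.

```c
#include <assert.h>
#include <stdio.h>

/* Splits "key=value" into its two parts.
   %31[^=]  : up to 31 characters that are not '='
   =        : a literal '=' that must follow
   %63[^\n] : up to 63 characters that are not '\n'
   Returns 1 only if both fields were stored. */
int split_key_value(const char *line, char key[32], char value[64])
{
    assert(line != NULL && key != NULL && value != NULL);
    return sscanf(line, "%31[^=]=%63[^\n]", key, value) == 2;
}

/* Uses the set from the man page: everything except ']', '0'-'9' and '-'.
   Stores the leading run of such characters in out (16 bytes).
   Returns 1 if at least one character matched. */
int prefix_before_special(const char *s, char out[16])
{
    assert(s != NULL && out != NULL);
    return sscanf(s, "%15[^]0-9-]", out) == 1;
}
```

For example, split_key_value("name=John Smith\n", key, value) stores "name" and "John Smith", while a line with no '=' makes it return 0. prefix_before_special("abc]def", out) stores "abc", and a string starting with a digit makes it return 0 because the set must match at least one character.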

In a nutshell, [^\n] means: read every character from the input that is not a \n and store it through the matching pointer in the argument list.

Other people have explained what %[^\n] means.

This is not an okay way to read strings. It is just as dangerous as the notoriously unsafe gets, and for the same reason: it has no idea how big the buffer at stringVariableName is.

The best way to read one full line from a file is getline, but not all C libraries have it. If yours doesn't, you should use fgets, which knows how big the buffer is, and be aware that you might not get a complete line (if the line is too long for the buffer).
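Here is one way the fgets route could look when you also want to know whether the line was cut short. This is only a sketch: the function name, the enum, and the choice to discard the overflow are my assumptions, not something fgets does for you.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum line_status { LINE_OK, LINE_TRUNCATED, LINE_EOF };

/* Reads one line from `in` into buf (size bytes) and strips the '\n'.
   If the line does not fit, the excess is discarded and LINE_TRUNCATED
   is returned, so the next call starts at the next line. */
enum line_status read_line_fgets(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size >= 2);

    if (fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;                 /* end of file or read error */

    size_t len = strcspn(buf, "\n");     /* index of '\n', or of the '\0' if none */
    if (buf[len] == '\n') {
        buf[len] = '\0';                 /* complete line: drop the newline */
        return LINE_OK;
    }

    /* No newline was stored: either the input ended without one,
       the newline is the very next character, or the line was too long. */
    int c = getc(in);
    if (c == EOF || c == '\n')
        return LINE_OK;

    while ((c = getc(in)) != '\n' && c != EOF)
        ;                                /* throw away the rest of the long line */
    return LINE_TRUNCATED;
}
```

A caller would typically declare char buf[1024]; and call read_line_fgets(stdin, buf, sizeof buf) in a loop until it returns LINE_EOF, deciding for itself what to do about LINE_TRUNCATED.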


 