
Any simple way to read a string of variable length in C?

I tried reading using:

char *input1, *input2;
scanf("%s[^\n]", input1);
scanf("%s[^\n]", input2);

I am obviously doing something wrong because the second string is read as null. I know using scanf() is not recommended but I couldn't find any other simple way to do the same.

The statement:

char *input1, *input2;

allocates memory for two pointers to char. Note that this only allocates memory for the pointers themselves, not for what they point to. The pointers are uninitialised and don't point to anything meaningful.

The call to scanf() then tries to write through these uninitialised pointers, which results in undefined behaviour.

Instead, you could declare character arrays of fixed size with automatic storage duration:

char input1[SIZE];

This allocates memory for the array, so scanf() has somewhere valid to write, provided the input fits in SIZE - 1 characters.

Alternatively, you could allocate memory dynamically for the pointers with one of the memory allocation functions:

char *input1 = malloc (size);

This declares a pointer to char and immediately initialises it with a pointer to a chunk of memory of size bytes. Note that the call to malloc() may fail. In that case it returns NULL, so check for it.
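As a minimal sketch of that check (alloc_buffer is a hypothetical helper name, not something from the standard library), the allocation could be wrapped like this:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes for a string.
 * Returns NULL if size is 0 or if malloc() fails. */
char *alloc_buffer(size_t size)
{
    if (size == 0)
        return NULL;            /* no room even for the terminator */

    char *buf = malloc(size);
    if (buf == NULL) {
        perror("malloc");       /* report why the allocation failed */
        return NULL;
    }
    buf[0] = '\0';              /* start out as an empty string */
    return buf;
}
```

The caller still has to test the result against NULL before using it, and free() it when done.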

But scanf() should not be used as a user-input interface. It does not guard against buffer overflows, and will leave a newline in the input buffer (which leads to more problems down the road).

Consider using fgets instead. It will null-terminate the buffer and read at most size - 1 characters.

The calls to scanf() can be replaced with:

fgets (buf, sizeof buf, stdin);

You can then parse the string with sscanf , strtol , et cetera.
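For instance, reading a whole line and then converting it with strtol might look like the sketch below. The helper name read_long and the 64-byte buffer are assumptions made for illustration. A line longer than the buffer would be split across calls.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from `in` and parse it as a long.
 * Returns 1 on success, 0 on end of file or if the line is not a number. */
int read_long(FILE *in, long *out)
{
    char buf[64];
    char *end;

    if (fgets(buf, sizeof buf, in) == NULL)
        return 0;               /* end of file or read error */

    errno = 0;
    long value = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE)
        return 0;               /* no digits at all, or out of range */
    if (*end != '\n' && *end != '\0')
        return 0;               /* trailing characters after the number */

    *out = value;
    return 1;
}
```

Called as read_long(stdin, &n), this consumes the whole line including the newline, so nothing is left behind in the input buffer for the next read.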

Note that fgets() will retain the trailing newline if there was space. You could use this one-liner to remove it:

buf[strcspn(buf, "\r\n")] = '\0';

This takes care of the carriage return as well, if any.

Or if you wish to continue using scanf() (which I advise against), use a field width to limit input and check scanf() 's return value:

if (scanf("%1023s", input1) != 1)  /* 1023 is a placeholder: use the buffer size minus 1 */
    fputs("no input\n", stderr);   /* end of file or input error */

That being said, if you wish to read a line of variable length, you need to allocate memory dynamically with malloc() , and then resize it with realloc() as necessary.

On POSIX-compliant systems, you could use getline() to read strings of arbitrary length, but note that it is open to a denial-of-service attack: there is no limit on the line length, so a very long line can exhaust memory.

You can use the m modifier in the format specifier, which makes scanf() allocate the buffer for you. Note that it is not standard C but a POSIX extension.

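char *a = NULL, *b = NULL;

/* %m makes scanf() allocate the buffers; expect 2 successful conversions */
if (scanf("%m[^\n] %m[^\n]", &a, &b) == 2) {
    // use a and b
    printf("*%s*\n*%s*\n", a, b);
}

free(a);   /* free(NULL) is a no-op if a conversion failed */
free(b);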

There are 2 simple ways to read variable length strings from the input stream:

  • using fgets() with an array large enough for the maximum length:
    char input1[200];
    if (fgets(input1, sizeof input1, stdin)) {
        /* string was read. strip the newline if present */
        input1[strcspn(input1, "\n")] = '\0';
        ...
    } else {
        /* nothing was read: premature end of file? */
        ...
    }
  • on POSIX compliant systems, you can use getline() to read strings of arbitrary length into arrays allocated with malloc() :
    char *input1 = NULL;
    size_t input1_size = 0;
    ssize_t input1_length = getline(&input1, &input1_size, stdin);

    if (input1_length >= 0) {
        /* string was read. length is input1_length */
        if (input1_length > 0 && input1[input1_length - 1] == '\n') {
            /* remove the newline if present */
            input1[--input1_length] = '\0';
        }
        ...
    } else {
        /* nothing was read: premature end of file? */
        ...
    }

Using scanf is not recommended because it is difficult to use correctly. Reading input with "%s" or "%[^\n]" without a specified maximum length is risky: any sufficiently long input will cause a buffer overflow and undefined behavior. Passing uninitialized pointers to scanf, as you do in the posted code, also has undefined behavior.

Any simple way to read a string of variable length in C?

Unfortunately, the answer is NO.

The input functions specified by the C standard (e.g. scanf, fgets, etc.) all require the caller to provide the input buffer. Once the input buffer is full, the functions will (when used correctly) return. So if the input is longer than the provided buffer, the functions will only read part of it, and the caller must add code to check for partial input and make additional function calls as needed.
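To illustrate the kind of check this means, here is a sketch of a fixed-buffer reader that reports whether the line was cut short. The name read_bounded_line and its return codes are made up for this example. It discards the part of the line that did not fit, which is only one of several possible policies.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into buf (at most size - 1
 * characters). Returns 1 if the whole line fit, 0 if it was truncated
 * (the rest of the line is discarded), -1 on end of file or error. */
int read_bounded_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* complete line: strip the newline */
        return 1;
    }

    /* No newline in the buffer: either the line was too long or the input
     * ended. Skip to the end of the line, noting whether anything was lost. */
    int c, truncated = 0;
    while ((c = getc(in)) != EOF && c != '\n')
        truncated = 1;
    return truncated ? 0 : 1;
}
```

A caller that cannot afford to lose the tail of the line would grow a buffer and keep reading instead of discarding.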

POSIX systems have the getline and getdelim functions, which can do it. So if you can accept limiting your code to POSIX-compliant systems, that's what you want to use.

If you need portable, standard-compliant code, you need to write your own function. For that you need to look into functions like realloc, fgets, strcpy, memcpy, etc. It's not a simple task, but it's not "rocket science" either. It's been done many, many times before, and if you search the net, you can very likely find an open source implementation that you can just copy (make sure to follow the rules for doing that).
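One possible shape for such a function, as a rough sketch only: read_line is a hypothetical name, the starting capacity of 64 is arbitrary, and the int cast passed to fgets means it is not meant for lines approaching INT_MAX bytes. It reads with fgets in chunks and doubles the buffer with realloc whenever a chunk fills it without reaching a newline.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from `in`.
 * Returns a malloc()ed string without the trailing newline, or NULL on
 * end of file or allocation failure. The caller must free() the result. */
char *read_line(FILE *in)
{
    size_t cap = 64, len = 0;
    char *line = malloc(cap);
    if (line == NULL)
        return NULL;

    while (fgets(line + len, (int)(cap - len), in) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[--len] = '\0';         /* complete line: strip the newline */
            return line;
        }
        if (len + 1 == cap) {           /* buffer is full: double it */
            char *tmp = realloc(line, cap * 2);
            if (tmp == NULL) {
                free(line);
                return NULL;
            }
            line = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {                     /* nothing was read at all */
        free(line);
        return NULL;
    }
    return line;                        /* last line had no newline */
}
```

Like getline(), this puts no upper bound on the line length, so a real implementation would probably want a maximum size parameter.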


 