C: bad access with fscanf and structs

I am reading from a text file which is formatted like so:

Firstname Surname Sex CountryOfOrigin NumberOfSiblings MotherAge FatherAge

Importing header files:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

A struct is defined as follows:

typedef struct {
    int person_ID; //not included in file
    char full_name[20];
    char sex[2];
    char countryOfOrigin[20];
    int num_siblings;
    float parentsAges[2]; //this should store mother and fathers age in an array of type float
} PersonalInfo;


void viewAllPersonalInformation(){
    FILE* file = fopen("People.txt", "r");
    if (file == NULL){
        printf("File does not exist");
        return;
    }
    int fileIsRead = 0;
    int idCounter = 0;

    PersonalInfo People[1000];
    //headers
    printf("%2s |%20s |%2s |%10s |%2s |%3s |%3s\n", "ID", "Name", "Sex", "Born In", "Number of siblings", "Mother's age", "Father's Age");

    do{
        fileIsRead = fscanf(file, "%s %s %s %d %f %f\n", People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);

        People[idCounter].person_ID = idCounter;
        printf("%d %s %s %s %d %f %f\n", People[idCounter].person_ID, People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, People[idCounter].num_siblings, People[idCounter].parentsAges[0], People[idCounter].parentsAges[1]);
        idCounter++;
    }
    while(fileIsRead != EOF);
    fclose(file);


    printf("Finished reading file");
}


int main() {
    viewAllPersonalInformation();
    return 0;
}

Where People.txt looks like:

John O'Donnell F Ireland 3 32.5 36.1

Mary Mc Mahon M England 0 70 75

Peter Thompson F America 2 51 60

fscanf() with %s stops reading at the first whitespace character. You expect the first %s to read the full name, but it stores only the first name in full_name . The surname is consumed by the second %s and ends up in sex , and every later field is shifted by one. Since sex holds only 2 bytes, writing a whole surname into it overflows the buffer, which is a likely cause of the bad access.
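To see the shift concretely, here is a minimal sketch. The helper parse_like_question is hypothetical (it is not part of the question's code), and its buffers are deliberately oversized so that the demonstration itself cannot overflow. It applies the question's six-field layout to a single line and reports how many conversions succeeded.

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): applies the question's
   six-field layout to one line and returns sscanf's conversion count.
   The three buffers are oversized on purpose so the demo itself is safe. */
int parse_like_question(const char *line,
                        char name[64], char sex[64], char country[64])
{
    int siblings;
    float mother, father;

    /* Same field order as the question's fscanf call, with widths added */
    return sscanf(line, "%63s %63s %63s %d %f %f",
                  name, sex, country, &siblings, &mother, &father);
}
```

Called with the first line of People.txt, this returns 3 rather than 6: name receives "John", sex receives "O'Donnell", country receives "F", and %d then fails on "Ireland". In the real struct, sex has room for only one character plus the terminator.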

So, if you want to read "Peter Thompson", then you would need to introduce two strings (char arrays) to store the first name and the last name, and then concatenate them.

However, your full names vary in number of words: "Peter Thompson" has 2 and "Mary Mc Mahon" has 3. So, if you stick with fscanf() , how many %s would you use, 2 or 3? You can't know in advance; it depends on the input, which you only get at runtime. I therefore suggest you use fgets() , which reads a whole line and takes the buffer size as an argument, so it also protects against buffer overflow. Maybe a scanset could do the trick with fscanf() (it has no regex support), but I believe that reading a line with fgets() and then parsing it is better practice.


Now that we have read a line of the file with fgets() , what do we do with it? We still don't know how many words each full name consists of! How do we find out? By counting the spaces the line contains. Assuming tokens are separated by single spaces, a line with w spaces has w + 1 tokens (words, numbers or characters in your example).

With a simple if-else statement, we can differentiate between these two scenarios in your example, when there are 6 spaces (7 tokens) and 7 spaces (8 tokens for "Mary Mc Mahon M England 0 70 75").

Now, how do we extract the tokens (full name, sex and so on) from the line? We could loop over it with a bunch of if-else statements that say: until I reach the 2nd (or 3rd, depending on the number of spaces) space, append the current token to full_name . Then the next token is the sex, and so on.
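The loop idea can be sketched as follows. This is a variation rather than the exact if-else scheme described above: it splits the line with strtok() and counts from the end, assuming the last five tokens are always sex, country, siblings and the two ages, so any number of leading words becomes the name. Person , parse_person , MAX_TOKENS and TRAILING_FIELDS are hypothetical names introduced only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 16
#define TRAILING_FIELDS 5 /* sex, country, siblings, mother's age, father's age */

typedef struct {
    char full_name[32];
    char sex[2];
    char countryOfOrigin[16];
    int num_siblings;
    float parentsAges[2];
} Person; /* hypothetical stand-in for PersonalInfo */

/* Returns 1 on success, 0 if the line has too few or too many tokens.
   Note: strtok() modifies 'line' in place. */
int parse_person(char *line, Person *p)
{
    char *tokens[MAX_TOKENS];
    int n = 0;

    /* Split on whitespace and remember where each token starts */
    for (char *t = strtok(line, " \t\r\n"); t != NULL; t = strtok(NULL, " \t\r\n")) {
        if (n == MAX_TOKENS)
            return 0;
        tokens[n++] = t;
    }
    if (n < TRAILING_FIELDS + 1) /* need at least one name word */
        return 0;

    int name_words = n - TRAILING_FIELDS;

    /* Append the name words one by one; snprintf truncates instead of overflowing */
    p->full_name[0] = '\0';
    for (int i = 0; i < name_words; i++) {
        size_t used = strlen(p->full_name);
        snprintf(p->full_name + used, sizeof p->full_name - used,
                 "%s%s", i ? " " : "", tokens[i]);
    }

    snprintf(p->sex, sizeof p->sex, "%s", tokens[name_words]);
    snprintf(p->countryOfOrigin, sizeof p->countryOfOrigin, "%s", tokens[name_words + 1]);

    /* atoi/strtof do no error reporting here; a stricter version would check them */
    p->num_siblings = atoi(tokens[name_words + 2]);
    p->parentsAges[0] = strtof(tokens[name_words + 3], NULL);
    p->parentsAges[1] = strtof(tokens[name_words + 4], NULL);
    return 1;
}
```

Counting from the end means the same function handles "Peter Thompson" and "Mary Mc Mahon" without a separate branch per word count, at the cost of assuming the trailing fields never contain spaces.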

Sure, you could do that, but since I am a bit lazy, I will just build on your good work with fscanf() and use sscanf() instead to extract the tokens. Of course, with this approach we need one or two (depending on the number of spaces) extra strings to temporarily store the parts of the surname before we join them into full_name .

Minimal Complete Working Example:

#include <stdio.h>
#include <string.h>

#define P 1000 // Max number of people
#define L 256  // Max length of line read from file (-1)

typedef struct {
    int person_ID; //not included in file
    char full_name[32];
    char sex[2];
    char countryOfOrigin[16];
    int num_siblings;
    float parentsAges[2];
} PersonalInfo;

int count_whitespaces(char* str)
{
    int whitespaces_count = 0;
    while(*str)
    {
        if(*str == ' ')
            whitespaces_count++;
        str++;
    }
    return whitespaces_count;
}

void viewAllPersonalInformation(){
    FILE* file = fopen("People.txt", "r");
    if (file == NULL){
        printf("File does not exist");
        return;
    }
    int idCounter = 0;

    PersonalInfo People[P];
    // line of file, placeholder for biworded surnames, surname.
    char line[L], str[8], surname[16];
    //headers
    // You have 7 format specifiers for the headers, but only 6 in fscanf!
    printf("%2s |%5s |%2s |%10s |%2s |%3s |%3s\n", "ID", "Name", "Sex", "Born In", "Number of siblings", "Mother's age", "Father's Age");

    // read into 'line', from 'file', up to 255 characters (+1 for the null terminator)
    while(fgets(line, L, file) != NULL) {
        //fileIsRead = fscanf(file, "%s %s %s %s %d %f %f\n", People[idCounter].full_name, People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);
        // eat trailing newline of fgets
        line[strcspn(line, "\n")] = 0;

        // Skip empty lines of file
        if(strlen(line) == 0)
            continue;

        // Temporary buffer for the first name, plus the sscanf bookkeeping
        char first[16];
        int expected, fields;

        // Each width is one less than its buffer size, leaving room for the '\0'.
        // %1s (unlike %c) null-terminates sex, so printing it with %s is safe.
        if(count_whitespaces(line) == 6)
        {
            str[0] = '\0'; // no middle word in this case
            expected = 7;
            fields = sscanf(line, "%15s %15s %1s %15s %d %f %f", first, surname, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);
        }
        else // 7 whitespaces, thus 8 tokens in the string
        {
            expected = 8;
            fields = sscanf(line, "%15s %7s %15s %1s %15s %d %f %f", first, str, surname, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);
        }

        // Skip lines that do not match the expected layout
        if(fields != expected)
        {
            printf("Skipping malformed line: %s\n", line);
            continue;
        }

        // Join the name parts with single spaces; snprintf truncates rather than overflowing full_name
        snprintf(People[idCounter].full_name, sizeof People[idCounter].full_name, "%s%s%s %s", first, str[0] ? " " : "", str, surname);

        People[idCounter].person_ID = idCounter;
        printf("%d %s %s %s %d %f %f\n", People[idCounter].person_ID, People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, People[idCounter].num_siblings, People[idCounter].parentsAges[0], People[idCounter].parentsAges[1]);
        idCounter++;
        if(idCounter == P)
        {
            printf("Max number of people read, stop reading any more data.\n");
            break;
        }
    }
    fclose(file);

    printf("Finished reading file.\n");
}


int main() {
    viewAllPersonalInformation();
    return 0;
}

Output:

ID | Name |Sex |   Born In |Number of siblings |Mother's age |Father's Age
0 John O'Donnell F Ireland 3 32.500000 36.099998
1 Mary Mc Mahon M England 0 70.000000 75.000000
2 Peter Thompson F America 2 51.000000 60.000000
Finished reading file.

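Did you notice the numbers in the format specifiers of sscanf() ? They guard against buffer overflows. The width counts characters only, not the terminating '\0', so a buffer of N bytes needs a width of at most N - 1.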
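As a small illustration of how a field width limits what %s stores, here is a minimal sketch. The helper read_short_word is hypothetical and exists only for this example; it reads into an 8-byte buffer with a width of 7.

```c
#include <stdio.h>

/* Hypothetical helper: reads the first word of 'input' into an 8-byte
   buffer. The width 7 is the buffer size minus one, so sscanf always
   has room to store the terminating '\0'. */
int read_short_word(const char *input, char out[8])
{
    return sscanf(input, "%7s", out);
}
```

Given the input "Thompson-Smith", the buffer ends up holding "Thompso": the first 7 characters plus the terminator, with the rest of the word left unread in the input.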


What about dynamic memory allocation?

In the code above, I estimated the maximum length of name, country of origin and such. Now how about having those sizes dynamic? We could, but we would still need an initial estimation.

So, we could read the name into a temporary array of fixed length, and then find the actual length of the string with strlen() . With that information in hand, we can dynamically allocate exactly enough memory (pointed to by a char pointer, and including one extra byte for the null terminator), and then copy the string from the temporary array to its final destination with strcpy() .
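A minimal sketch of that idea, assuming a 64-byte temporary buffer is a large enough initial estimate. The helper read_word_dynamic is a hypothetical name introduced for illustration, and the caller is responsible for calling free() on the result.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one word from 'input' into a fixed temporary
   buffer, then returns a heap copy sized to fit. Returns NULL if no word
   was found or if allocation failed. The caller must free() the result. */
char *read_word_dynamic(const char *input)
{
    char temp[64]; /* initial estimate; longer words are truncated to 63 chars */

    if (sscanf(input, "%63s", temp) != 1)
        return NULL;

    char *copy = malloc(strlen(temp) + 1); /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;

    strcpy(copy, temp); /* safe: 'copy' was sized from strlen(temp) */
    return copy;
}
```

With this approach a struct field such as full_name would be declared as char * instead of a fixed array, and each record would own memory that has to be released when it is no longer needed.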

If you have a pointer field char *full_name , it is just a pointer: it must be initialized to point at an existing object, which for char * is usually an array of char. You can fix it in two ways:

  • Make the field an array, like char full_name[100] , and pass the maximum string length in the scanf format string, like %99s (one less than the array size, to leave room for the terminating null). This is the simplest way.
  • Allocate memory with malloc (and don't forget to free it), or assign some other valid address to the pointer. For example, declare an array as an ordinary automatic variable and assign the address of its first element to the pointer field; just remember that this address becomes invalid once the function returns.

There is another problem. The %s conversion specifier tells fscanf to read a single word, stopping at any whitespace character such as a space. With your input format, full_name will therefore be read only up to the first space, and the later attempt to read an integer will fail.
