
C: Reading a text file into a struct array

I've been trying to wrap my head around this problem for the past two days and still have no idea how to approach it. I have to write a function (getRawData) that is passed a pointer to a file already open for reading, an array of structs, and the number of records currently in that array (through the parameter currSize). The function should read the data from the file into the array, placing it at the end of the array, and return the total number of records in the file after reading. My problem is that I don't know how to approach reading the file and storing the data in the struct array. I've been reading up on files and records and I'm still at a loss. The function in question is getRawData.

The program actually uses two files, not one (malebabynames.csv and femalebabynames.csv). It asks the user for a name, then checks the records from the files to see how popular that name is. Since some boy and girl names are interchangeable, both files need to be read for every name the user enters. Here's the code I have so far:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

struct nameRecord {
    char name[31];
    int year;
    int frequency;
};
struct nameRecord;
void allCaps(char[]);
int getRawData(FILE*, struct nameRecord[], int);
void setYearTotals(struct nameRecord[], int, int);
void setNameYearTotals(char, struct nameRecord[], int, int);
void getPerHundredThousand(int, int, double);
void printData(double);
void graphPerHundredThousand(double);

int main(void)
{
    int currSizem = 0;
    int currSizef = 0;
    struct nameRecord records[currSizem];
    FILE* fp = NULL;
    FILE* fp2 = NULL;
    char name[31];
    printf("Please enter your name: ");
    scanf("%30[^\n]", name);
    printf("your name is %s\n", name);

//opening both male and female name files and reading them in order to get the total number of records in the array
    fp = fopen("malebabynames.csv", "r");
    if (fp != NULL) {
        printf("file opened\n");
        while(3 == fscanf(fp, "%[^,],%d,%d", records[currSizem].name, &records[currSizem].year, &records[currSizem].frequency)) {
            currSizem++;
        }
    } else {
        printf("file failed to open\n");
    }

    fp2 = fopen("femalebabynames.csv", "r");
    if (fp2 != NULL) {
        printf("file opened\n");
        while(3 == fscanf(fp2, "%[^,],%d,%d", records[currSizef].name, &records[currSizef].year, &records[currSizef].frequency)) {
            currSizef++;
        }
    } else {
        printf("file failed to open\n");
    }
    return 0;
}


//function that automatically capitalizes the users inputted name
void allCaps(char s[]) {
    while(*s != '\0') {
        *s = toupper((unsigned char) *s);
        s++;
    }
}
//function that reads and places the read files into the struct arrays
int getRawData(FILE* fp, struct nameRecord records[], int currSize) {
    for(i = 0; i < currSize; i++) {
        fscanf(fp, "%[^,],%d,%d", records[i].name, &records[i].year, &records[i].frequency);
    }
}

Exact instructions as per the website:

Your program will ask the user to enter a name (you may assume that the name will be no more than 30 characters long). It will then find the popularity of the name between 1921 and 2010 and print out a chart and graph. The program will then ask the user if they wish to do another analysis and repeat the process.

The program will pull information from the following data sources in determining the popularity of a name.

  • Ontario female baby names
  • Ontario male baby names

Note that some names are considered both male and female, so your program will need the data from both files regardless of the name entered.

Due to the fact that the number of people born has changed over time (you expect that there were more births in 1945 than there were in 1920), you cannot simply use the raw numbers in determining popularity. Suppose there were 100 births and 50 of the babies were named "Michael" in 1920 vs 1000 births with 55 of the babies named Michael in 1950. In this case the popularity of "Michael" was actually higher in 1920 than in 1950.

The datasets were collected from 1917 to 2010. Your program will determine how many times a name occurs per hundred thousand births in 5 year intervals starting from the year 1921 (you can ignore data before 1921). Thus, if the user entered "Allison", you will determine the popularity of "Allison" from 1921 to 1925, 1926-1930, 1931-1935, ..., 2006-2010 (there are a total of 18 five year periods between 1921 and 2010). You can assume that there are no more than 150,000 records in total in the two files combined.

In each case the total population is the sum of all births within the year range for both males and females. The number with the name you are interested in is the sum of all births with the given name in the year range. Your program will present the data as both a value (number of babies with the name per hundred thousand births) and a graph of that data.

You're on the right track reading each line from the file into a record. As for memory for the record array: if you know there will be no more than (say) 1000 names in each file, the easy way is to declare struct nameRecord records[1000].

But if you've been explicitly told that there is no limit on the number of records, and the program has to work that out at run time, you will need to allocate the array dynamically:

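#include <stdlib.h>   /* malloc, free */

struct nameRecord *records;

...
/* after counting currSizem lines in the file */
records = malloc(currSizem * sizeof(struct nameRecord));
if (records == NULL) {
    /* allocation failed: report the error and stop */
}

Now you can rewind the file (for example with rewind(fp)) and read it again, knowing that the array has exactly as many records as you need.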

When you don't need the records data any more, make sure to call free() to return the memory that you allocated:

free(records);
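
Putting the counting pass, the allocation, and the second read together might look like the sketch below. It assumes each line has the form NAME,year,frequency with no header row (the format the question's fscanf calls expect). countRecords and loadRecords are hypothetical helper names, not part of the assignment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct nameRecord {
    char name[31];
    int year;
    int frequency;
};

/* Hypothetical helper: counts well-formed "NAME,year,frequency" lines
   from the current position, then rewinds to the start of the file. */
int countRecords(FILE *fp)
{
    char name[31];
    int year, frequency, n = 0;

    assert(fp != NULL);
    /* The leading space skips the newline left over from the previous line;
       %30 keeps the name within the 31-byte buffer. */
    while (fscanf(fp, " %30[^,],%d,%d", name, &year, &frequency) == 3)
        n++;
    rewind(fp);
    return n;
}

/* Hypothetical helper: allocates exactly enough records for the file and
   fills them. Returns NULL if the file has no records or malloc fails.
   The caller owns the result and must free() it. */
struct nameRecord *loadRecords(FILE *fp, int *count)
{
    int n = countRecords(fp);
    struct nameRecord *records;
    int i;

    *count = 0;
    if (n == 0)
        return NULL;
    records = malloc((size_t)n * sizeof *records);
    if (records == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        if (fscanf(fp, " %30[^,],%d,%d", records[i].name,
                   &records[i].year, &records[i].frequency) != 3)
            break;
    }
    *count = i;
    return records;
}
```

A caller would open the file, call loadRecords(fp, &count), work with the count records, and then free() the returned pointer. Counting first costs a second pass over the file; growing the array with realloc while reading is a common alternative.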

Something to consider: do you need to treat female and male baby names differently? Or just measure their popularity across all babies? If they need to be checked separately then you will need to keep two arrays, one for each file. But if the point is to read both files into a single data structure, then you will need to:

  • Allocate enough memory in the array to hold both files
  • Watch for duplicate names when reading them in
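
If popularity is measured across all babies, as the quoted instructions suggest, a name that appears in both files needs no special handling: matching records are simply summed. The sketch below shows one way to do that per five-year period. It assumes that the sum of every recorded frequency in a period stands in for the total births in that period. periodIndex, perHundredThousand, and the three constants are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <string.h>

#define FIRST_YEAR  1921
#define LAST_YEAR   2010
#define NUM_PERIODS 18      /* 1921-1925, 1926-1930, ..., 2006-2010 */

struct nameRecord {
    char name[31];
    int year;
    int frequency;
};

/* Maps a year to its five-year period (0 = 1921-1925, 17 = 2006-2010),
   or returns -1 for years outside 1921-2010. */
int periodIndex(int year)
{
    if (year < FIRST_YEAR || year > LAST_YEAR)
        return -1;
    return (year - FIRST_YEAR) / 5;
}

/* Fills rate[p] with the occurrences of name per 100,000 recorded births
   in period p. Every record, male or female, counts toward the total. */
void perHundredThousand(const struct nameRecord records[], int count,
                        const char *name, double rate[NUM_PERIODS])
{
    long total[NUM_PERIODS] = {0};
    long named[NUM_PERIODS] = {0};
    int i, p;

    assert(records != NULL && name != NULL && rate != NULL);
    for (i = 0; i < count; i++) {
        p = periodIndex(records[i].year);
        if (p < 0)
            continue;                       /* ignore data before 1921 */
        total[p] += records[i].frequency;
        if (strcmp(records[i].name, name) == 0)
            named[p] += records[i].frequency;
    }
    for (p = 0; p < NUM_PERIODS; p++)
        rate[p] = total[p] > 0 ? 100000.0 * named[p] / total[p] : 0.0;
}
```

Because strcmp is case-sensitive, the name passed in should be normalized the same way as the names in the files, for example with the question's allCaps function if the files store names in upper case.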

Is the getRawData function correct at all? What changes would I need to make for it to read into the struct array correctly?

Provided that you have ensured that records[] is large enough for 150,000 records, getRawData() needs only minimal changes to detect the end of the data and return the number of records read from the file.

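int getRawData(FILE *fp, struct nameRecord records[], int currSize)
{
    int i;  /* number of records read so far */

    /* stop after currSize records or at the first line that does not parse;
       %30 keeps the name within the 31-byte buffer */
    for (i = 0; i < currSize; i++)
        if (fscanf(fp, "%30[^,],%d,%d ", records[i].name,
                                        &records[i].year,
                                        &records[i].frequency) < 3) break;
    return i;
}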

Then you can replace the while loops in main() by

        currSizem = getRawData(fp, records, 150000);

and

        currSizef = getRawData(fp, records+currSizem, 150000-currSizem);

After this, records 0 through currSizem-1 will hold the males, and records currSizem through currSizem+currSizef-1 will hold the females.
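
For comparison, here is a sketch of getRawData that follows the question's description more literally: currSize is taken as the number of records already in the array, new records are appended after them, and the new total is returned. MAX_RECORDS is an assumed name for the 150,000-record bound from the assignment, and the input is assumed to be NAME,year,frequency on each line.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_RECORDS 150000  /* assumed name for the assignment's upper bound */

struct nameRecord {
    char name[31];
    int year;
    int frequency;
};

/* Appends records read from fp after the first currSize entries and
   returns the new total number of records in the array. */
int getRawData(FILE *fp, struct nameRecord records[], int currSize)
{
    int i = currSize;

    assert(fp != NULL && records != NULL);
    /* The leading space skips the newline left by the previous line;
       reading stops at end of file, at a malformed line, or when full. */
    while (i < MAX_RECORDS &&
           fscanf(fp, " %30[^,],%d,%d", records[i].name,
                  &records[i].year, &records[i].frequency) == 3)
        i++;
    return i;
}
```

With this variant the two calls in main() would become currSizem = getRawData(fp, records, 0); followed by total = getRawData(fp2, records, currSizem);, where total is a hypothetical variable holding the combined count.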
