
Reading .csv file using fgets() function

I have a .csv file which contains 4 rows and 2 columns of data as displayed below:

0;0
0;1
1;0
1;1

My goal is to read the data and store it in a multidimensional array defined as trIn[4][2]. The problem is that whenever I run the programme and print the stored values, the values in the first row (row 0) are wrong and seem random. Here is the code I wrote; I would much appreciate it if anyone could help.

FILE *inptr;
inptr = fopen("in.csv", "r");
if(inptr == NULL)
{
    printf("ERROR: Could not open file 'in.csv'\n");
    return 1;
}

// store the values from the .csv file in a multi-dimensional array
char content[100];
int i;
for(i=0; i<4; i++)
{
    fgets(content, 100, inptr);
    trIn[i][0] = content[0] - '0';
    trIn[i][1] = content[2] - '0';
}

// print the stored values once all four rows have been read
for(i=0; i<4; i++)
{
    printf("trIn[%d][0] = %d\ntrIn[%d][1] = %d\n", i, trIn[i][0], i, trIn[i][1]);
}
return 0;

What gets displayed on my terminal is the following:

trIn[0][0] = -65
trIn[0][1] = -113
trIn[1][0] = 0
trIn[1][1] = 1
trIn[2][0] = 1
trIn[2][1] = 0
trIn[3][0] = 1
trIn[3][1] = 1

Your data file starts with a BOM (Byte Order Mark), probably because it was created on (or came from) Windows and is saved as UTF-8.

See the Unicode FAQ on the BOM (Byte Order Mark). The bytes of the UTF-8 BOM are 0xEF, 0xBB, 0xBF.

This little program shows that when you subtract '0' from the bytes of the BOM, you get the values you see in your output:

#include <stdio.h>

int main(void)
{
    char bom[] = "\xEF\xBB\xBF";

    for (int i = 0; bom[i] != '\0'; i++)
        printf("%d: 0x%.2X => %d\n", i, (unsigned char)bom[i], bom[i] - '0');
    return 0;
}

The output is:

0: 0xEF => -65
1: 0xBB => -117
2: 0xBF => -113

As you can see, the mapped values at index 0 and 2 are -65 and -113, exactly as you found.
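If you want to confirm that in.csv really begins with these bytes, you could inspect the first three bytes of the stream directly. The following is only a minimal sketch: the helper name starts_with_utf8_bom is hypothetical (it is not a standard function), and it assumes the stream is seekable and has just been opened.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the stream begins with the UTF-8 BOM
   (0xEF 0xBB 0xBF), 0 otherwise. The stream is rewound before returning,
   so the caller can still read the file from the start afterwards.
   Assumption: fp is a seekable stream positioned at the beginning. */
int starts_with_utf8_bom(FILE *fp)
{
    unsigned char head[3];
    size_t n;

    assert(fp != NULL);
    n = fread(head, 1, sizeof head, fp);   /* may read fewer than 3 bytes */
    rewind(fp);

    return n == 3 && memcmp(head, "\xEF\xBB\xBF", 3) == 0;
}
```

Calling it right after fopen() and printing the result should tell you whether the BOM is what is corrupting row 0.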

You can skip the BOM with code such as the following (strncmp() is declared in <string.h>):

char *data = content;
if (strncmp(data, "\xEF\xBB\xBF", 3) == 0)
    data += 3;   /* Skip BOM */

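You then analyze data[0] and data[2] instead of content[0] and content[2]. Only the first line of the file can carry a BOM. Your scanning algorithm is very fragile (it assumes single-digit values at fixed positions in each line), and the BOM just makes it harder. This is a common problem with files generated on Windows; Unix systems usually do not put a BOM at the start of UTF-8 data files.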
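A less fragile approach might combine the BOM skip with sscanf(), so that the values no longer have to be single digits at fixed offsets. This is a sketch under assumptions, not the only way to do it: the helper names skip_utf8_bom, parse_row and read_rows are hypothetical, and the format is assumed to be two integers separated by a semicolon, as in your file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return a pointer past a leading UTF-8 BOM, if any. */
const char *skip_utf8_bom(const char *line)
{
    assert(line != NULL);
    if (strncmp(line, "\xEF\xBB\xBF", 3) == 0)
        return line + 3;
    return line;
}

/* Hypothetical helper: parse one "a;b" line into row[0] and row[1].
   Returns 1 on success, 0 if the line does not hold two integers. */
int parse_row(const char *line, int row[2])
{
    return sscanf(line, "%d;%d", &row[0], &row[1]) == 2;
}

/* Hypothetical helper: read up to max_rows lines from fp into rows.
   Returns the number of rows stored; stops at EOF or a malformed line. */
int read_rows(FILE *fp, int rows[][2], int max_rows)
{
    char content[100];
    int n = 0;

    while (n < max_rows && fgets(content, sizeof content, fp) != NULL)
    {
        /* Only the first line of the file can carry a BOM. */
        const char *data = (n == 0) ? skip_utf8_bom(content) : content;

        if (!parse_row(data, rows[n]))
            break;
        n++;
    }
    return n;
}
```

In your program, the reading loop would then reduce to a call such as read_rows(inptr, trIn, 4), and a return value other than 4 would indicate a short or malformed file.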
