getc with Windows vs Unix

Question

I have a question regarding the following code:

while((c = getc(pFile)) != EOF)
{
    if(c != '\n')
    {
        input[index] = (char)c;
        index++;
    }
    else
    {
        input[index] = '\0';
        index = 0;
    }
}

On Windows, the c = getc(pFile) call reads '\n' (code 10) twice. For example, I'm reading a file with the following two lines:

Hello world
Test

getc reads in Hello world, but then returns 10 ('\n') twice in a row, which resets the input array to an empty string (because of the '\0'). On Unix, the '\n' is only read once, so it all works.

Any ideas?

Thanks in advance.

Answer 1

Is the file physically the same, i.e. bit for bit, on the two platforms? If so, that's asking for trouble, since the encoding of line endings differs.
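
To illustrate the point about line-ending encodings, here is a minimal sketch of a reader that accepts both "\n" and "\r\n" terminators. The helper name read_line_any_eol and its return convention are assumptions made for this example and are not part of the original code; it also assumes cap is at least 1.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the question): reads one line into buf,
   storing at most cap - 1 characters, and accepts "\n" or "\r\n" as the
   terminator. Returns 1 if anything was read, 0 at end of file.
   Assumes cap >= 1. */
int read_line_any_eol(FILE *fp, char *buf, size_t cap)
{
    size_t n = 0;
    int c;
    int got_any = 0;

    while ((c = getc(fp)) != EOF) {
        got_any = 1;
        if (c == '\n')
            break;                 /* end of line on both platforms */
        if (n + 1 < cap)
            buf[n++] = (char)c;    /* silently truncates overlong lines */
    }

    /* Drop a trailing '\r' left over from a "\r\n" terminator. */
    if (n > 0 && buf[n - 1] == '\r')
        n--;

    buf[n] = '\0';
    return got_any;
}
```

Calling it in a loop, for example while (read_line_any_eol(pFile, input, sizeof input)) { ... }, should yield "Hello world" and then "Test" whether the file uses Unix or Windows line endings.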

Answer 2

Windows terminates lines with \r\n. Maybe this could help:

$ echo test | unix2dos > /tmp/test
$ hexdump -c /tmp/test
0000000   t   e   s   t  \r  \n                                        
0000006

Strangely, the value of \r is 13, so I don't know what is going wrong.
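
One way to check what getc really returns is to print the numeric code of every character, much like the hexdump above but from inside the C program. This is only a diagnostic sketch; the helper name dump_codes is made up for this example.

```c
#include <stdio.h>

/* Hypothetical diagnostic helper: writes the decimal code of every
   byte that getc returns from `in`, separated by spaces, to `out`. */
void dump_codes(FILE *in, FILE *out)
{
    int c;

    while ((c = getc(in)) != EOF)
        fprintf(out, "%d ", c);

    fputc('\n', out);
}
```

Calling dump_codes(pFile, stdout) on a stream opened with fopen(path, "rb") shows the raw bytes, so a Windows-style terminator appears as 13 followed by 10. Opening the same file in text mode ("r") on Windows normally hides the 13, so comparing the two outputs can help narrow down where an extra 10 comes from.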

Answer 3

Try this:

while((c = getc(pFile)) != EOF)
{
    if(c != '\n')
    {
         input[index] = (char)c;
         index++;
    }
    else
    {
         if (!index)
              continue; // skips repeated '\n' (empty lines)

         input[index] = '\0';
         index = 0;
    }
}

3 answers

solution1
1 ACCEPTED 2011-07-07 17:32:23

solution2
0 2011-07-07 17:33:57

solution3
0 2011-07-07 17:51:57