
Weird line ending conversion (CR, LF, CRLF) with istreambuf_iterator<char>(ifstream(…, ios::binary))

I'm writing a CRC32 routine in MSVC++2010 and need to read a file in binary mode, byte by byte.

I'm doing it with ifstream and istreambuf_iterator, and it generally works, but it does some weird things to line endings.

For example, if I have a file with the contents

LF LF LF CR CR CR

the output of my program is

 (10) (10) (13) (13) (13) (13)

so basically, it replaced the last LF with a CR. Strange.

If I have

CR CR LF CR

it's

 (13) (10) (13) (13)

so it swapped the CRLF! It also swaps them when there are more in the file.

Is there any workaround? I'd like to stick to C++ for this and I actually want to read binary files with no interpretation of line endings whatsoever (and I thought only istream_iterator would do that)!


Just for completeness, the testing code I have is this, adapted from pyCRC:

static inline crc_t crc_update(crc_t                          crc, 
                               std::istreambuf_iterator<char> data, 
                               long long                      data_len)
{
    unsigned int tbl_idx;
    while (data_len--) 
    {
        tbl_idx = (crc ^ *data) & 0xff;
        crc = (crc_table[tbl_idx] ^ (crc >> 8)) & 0xffffffff;

        data++;

        std::cout << " (" << int(*data) << ")";
    }
    return crc & 0xffffffff;
}

int main(int argc, char* argv[])
{
    std::ifstream file(argv[1], std::ios::in | std::ios::binary);
    struct _stati64 filestats;
    errno_t stat_error = _stati64(argv[1], &filestats);
    if (stat_error != 0)
        return errno;
    std::cout << crc_finalize(
                      crc_update(crc_init(),
                                 std::istreambuf_iterator<char>(file),
                                 filestats.st_size));
}

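It isn't that the last LF was replaced with a CR. The first byte of the file is missing from the output, because you do data++; before std::cout << " (" << int(*data) << ")"; so each iteration prints the byte after the one it just fed into the CRC. On the last iteration the iterator has already reached the end of the stream, and dereferencing it there is undefined behavior; in your runs it happened to yield the last byte again. Print before incrementing and the output matches the file.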
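As a minimal sketch of the corrected order (dump_bytes is a hypothetical helper, not part of the original code, and it leaves the CRC update out), each byte is printed before the iterator advances, and the loop stops at the end-of-stream iterator instead of counting down a separately obtained length:

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): formats every byte of
// `in` as " (N)", reading each byte BEFORE advancing the iterator.
std::string dump_bytes(std::istream& in)
{
    std::ostringstream out;
    std::istreambuf_iterator<char> it(in), end;  // default-constructed == end of stream
    for (; it != end; ++it)
    {
        // Go through unsigned char so bytes >= 0x80 do not print as negatives.
        out << " (" << static_cast<int>(static_cast<unsigned char>(*it)) << ")";
    }
    return out.str();
}
```

Fed a stream holding LF LF LF CR CR CR, this should give three 10s followed by three 13s, with nothing skipped and nothing repeated. Comparing against the end iterator also means the loop can never dereference past the end of the stream, even if the size reported by _stati64 and the actual stream contents disagree.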

Tried it myself and it isn't missing anything ...

btw: I'm using g++.
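To repeat that check on another compiler, a round trip along these lines may help. It is only a sketch: round_trip and the file name passed to it are made up for illustration. It writes some CR/LF bytes in binary mode, reads them back through istreambuf_iterator, and returns what was read so it can be compared with what was written:

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes `bytes` to `path` in binary mode, then reads
// the whole file back in binary mode and returns its contents.
std::string round_trip(const std::string& path, const std::string& bytes)
{
    {
        std::ofstream out(path.c_str(),
                          std::ios::out | std::ios::binary | std::ios::trunc);
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }   // `out` is closed here, so the data is flushed before reading

    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

If the returned string equals the input, the stream did not translate any line endings, and any difference seen in the printed output comes from the printing loop rather than from ifstream.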
