C++ binary files I/O, data lost when writing

I am learning C++ with the book "Programming: Principles and Practice Using C++" by Bjarne Stroustrup. I am currently studying chapter 11 and found an example of how to read and write binary files of integers (section 11.3.2). I played around with the example: I took a .txt file (input.txt) containing a sentence, read it and wrote it to another file (output.txt) (the text_to_binary function), and then read that file and wrote it back to the original file (input.txt) (the binary_to_text function).

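#include<fstream>
#include<iostream>
#include<string>

using namespace std;

// as_bytes() is the helper used by the book's example in section 11.3.2;
// it is defined here so the listing compiles without the book's support header.
template<class T>
char* as_bytes(T& i)
{
    void* addr = &i;    // address of the first byte of the object
    return static_cast<char*>(addr);
}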

void text_to_binary(ifstream &ifs, ofstream &ofs)
{
    for (int x; ifs.read(as_bytes(x), sizeof(char));)
    {
        ofs << x << '\n';
    }
    ofs.close();
    ifs.close();
}

void binary_to_text(ifstream &ifs, ofstream &ofs)
{
    for (int x; ifs >> x;)
    {
        ofs.write(as_bytes(x), sizeof(char));
    }
    ifs.close();
    ofs.close();
}

int main()
{
    string iname = "./chapter_11/input.txt";
    string oname = "./chapter_11/output.txt";

    ifstream ifs{iname, ios_base::binary};
    ofstream ofs{oname, ios_base::binary};

    text_to_binary(ifs, ofs);

    ifstream ifs2{oname, ios_base::binary};
    ofstream ofs2{iname, ios_base::binary};

    binary_to_text(ifs2, ofs2);

    return 0;
}

I figured out that I have to use sizeof(char) rather than sizeof(int) in the .read and .write calls. If I use sizeof(int), some chars of the .txt file go missing when I write them back to text. Funnily enough, chars only go missing if

x % 4 != 0 (x = number of chars in the .txt file)

example with sizeof(int):

input.txt: hello this is an amazing test. 1234 is a number everything else doesn't matter..asd

(text_to_binary fnc) results in:

output.txt:

1819043176
1752440943
1763734377
1851859059
1634558240
1735289210
1936028704
824192628
540291890
1629516649
1836412448
544367970
1919252069
1768453241
1696622446
543519596
1936027492
544483182
1953784173
774795877

(binary_to_text fnc) results back in:

input.txt: hello this is an amazing test. 1234 is a number everything else doesn't matter..

The trailing asd went missing.

Now to my question: why does this happen? Is it because ints are stored as 4 bytes?

Bonus question: Out of interest, is there a simpler/more efficient way of doing this?

edit: updated the question with the results to make it hopefully more clear

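When fewer bytes remain in the file than you asked for, the read can only be partially satisfied: it runs into the end of the file, and both the eof and fail flags are set on the stream. The stream then evaluates to false in the loop condition, so the loop ends without processing the bytes that were read.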
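The behavior described above can be observed on an in-memory stream. This is only a small sketch: ReadResult and try_read4 are names made up for the illustration, and a fixed 4-byte buffer stands in for an int.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper type for this sketch: the outcome of one 4-byte read attempt.
struct ReadResult {
    bool ok;              // what a loop condition such as `ifs.read(...)` evaluates to
    bool eof;             // is eofbit set?
    bool fail;            // is failbit set?
    std::streamsize got;  // bytes actually extracted, as reported by gcount()
};

// Try to read exactly 4 bytes from an in-memory stream holding `data`.
ReadResult try_read4(const std::string& data)
{
    std::istringstream in{data};
    char buf[4] = {};
    in.read(buf, sizeof buf);
    return {static_cast<bool>(in), in.eof(), in.fail(), in.gcount()};
}
```

Calling try_read4("abcd") reports a successful read of 4 bytes with no flags set, while try_read4("asd") reports a failed read with both flags set, even though gcount() still says that 3 bytes were extracted. Those 3 bytes are exactly what the loop in the question throws away.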

You need to check the stream's gcount() after the loop to see whether any bytes were actually read into the variable x.

But note that a partial read only writes to part of the variable x, leaving the rest indeterminate. Exactly which part of the value is affected depends on the system's endianness, and using the variable with its indeterminate bits leads to undefined behavior.
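One way to put these points together is sketched below. It is not the book's code: Chunks and read_int_chunks are hypothetical names, and the sketch assumes it is acceptable to pad the final chunk with zero bytes. Each int is zero-initialized before the read, so a short read leaves no indeterminate bytes, and gcount() decides whether the chunk is kept.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct Chunks {
    std::vector<int> values;     // one int per chunk, including a zero-padded tail
    std::streamsize tail_bytes;  // valid bytes in the last value; 0 if there was no partial chunk
};

// Read a stream in sizeof(int)-byte chunks without dropping a partial final chunk.
Chunks read_int_chunks(std::istream& in)
{
    Chunks result{};
    for (;;) {
        int x = 0;  // zero-initialize: a short read then leaves no indeterminate bytes
        in.read(reinterpret_cast<char*>(&x), sizeof x);
        const std::streamsize got = in.gcount();
        if (got == 0) break;             // nothing was read: end of input
        result.values.push_back(x);
        if (got < static_cast<std::streamsize>(sizeof x)) {
            result.tail_bytes = got;     // partial final chunk: remember its real size
            break;
        }
    }
    return result;
}

// Convenience overload for experimenting with in-memory data.
Chunks read_int_chunks(const std::string& bytes)
{
    std::istringstream in{bytes};
    return read_int_chunks(in);
}
```

With the 83-character sentence from the question and a 4-byte int, this yields 21 values and tail_bytes equal to 3. When converting back, the writer would then have to emit only tail_bytes bytes of the last value (for example with write(as_bytes(x), tail_bytes)) so that the padding does not end up in the text file.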
