Reading a file as binary, compressing and writing it back as binary

Question

We have been given an assignment to implement the Shannon-Fano compression algorithm in C++. While the algorithm wasn't much of a problem, I am having trouble reading the given files (txt, Excel, BMP) as binary for compression purposes.

Our professor gave us a few tips, but I don't get how they are supposed to be used. He said we should make an object that takes the path to the file and has methods to read a bit, a byte, an integer, and a float from the binary file. While I get what readBin and readByte do, I don't understand how one would use the readInt or readFloat method (how the fstream would know that the next char is an int or a float).

Does anyone have any idea on how to implement the binary reading with the methods I have listed above? Thanks!

Answer 1

Unless you need to take the internal format of the different files (BMP, XLSX, etc.) into account to improve compression, I don't think there's any particular reason to treat them as anything but a binary stream: a bunch of bytes to which you apply the compression algorithm.

I suggest you take a look at this answer, which has a fairly simple example of how to read a binary file in C++: https://stackoverflow.com/a/16435334/9390121

Once you have the file read into memory, it's a matter of compressing it and writing it back to disk (i.e. write() instead of read()).
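A minimal sketch of that read-everything / write-everything approach, assuming the whole file fits in memory. The names readAllBytes and writeAllBytes and the scratch file name are made up for illustration; they are not from the linked answer.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Read the entire file at `path` into memory as raw bytes.
std::vector<char> readAllBytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);  // binary: no newline translation
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Write `data` to `path`, replacing any existing file.
void writeAllBytes(const std::string& path, const std::vector<char>& data) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open " + path);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    if (!out) throw std::runtime_error("write failed: " + path);
}

// Usage check: bytes such as '\0' and '\n' must survive a round trip unchanged.
// "roundtrip_demo.bin" is a scratch file name chosen for this sketch.
void demoRoundTripFile() {
    const std::vector<char> data = {'a', '\0', '\n', static_cast<char>(0xFF)};
    writeAllBytes("roundtrip_demo.bin", data);
    assert(readAllBytes("roundtrip_demo.bin") == data);
}
```

In the assignment, the compression step would sit between the two calls: read the input bytes, pass them to your own Shannon-Fano routine, and write whatever bytes it returns.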

Answer 2

"While I get what readBin and readByte do, I don't understand how one would use the readInt or readFloat method (how the fstream would know that the next char is an int or a float)."

Well, you don't need any of that for this application. You just need to read in all the binary data, compress it, and write out the compressed data. Running the same steps with decompression in place of compression reverses the process.

But to answer your question: the fstream doesn't know what the next bytes are. Your file format decides that, and your code reads accordingly. These are the steps:

1. Define precisely what each byte means in your format. For readInt, for example, you might choose four bytes holding a signed integer in big-endian order.
2. Read the appropriate number of bytes. For a four-byte readInt, read four bytes, probably into a char buffer.
3. Parse the bytes according to your format into whatever type you want to return.
4. Return that value.
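The steps above could be sketched as follows, assuming the big-endian four-byte format from the example and a 32-bit IEEE 754 float for readFloat. The helper readUint32BE and the demo function are hypothetical names, and the functions take std::istream& so they work with std::ifstream as well as in-memory streams.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Steps 2 and 3: read four bytes and combine them, most significant first.
std::uint32_t readUint32BE(std::istream& in) {
    unsigned char bytes[4];
    if (!in.read(reinterpret_cast<char*>(bytes), 4))
        throw std::runtime_error("unexpected end of stream");
    return (std::uint32_t{bytes[0]} << 24) | (std::uint32_t{bytes[1]} << 16) |
           (std::uint32_t{bytes[2]} << 8)  |  std::uint32_t{bytes[3]};
}

// Signed 32-bit integer: reinterpret the bit pattern as two's complement.
std::int32_t readInt(std::istream& in) {
    const std::uint32_t bits = readUint32BE(in);
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Float: same four bytes, reinterpreted as an IEEE 754 single (assumption).
float readFloat(std::istream& in) {
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
    const std::uint32_t bits = readUint32BE(in);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Usage check on an in-memory stream; a std::ifstream opened with
// std::ios::binary is read the same way.
void demoReadInt() {
    std::istringstream in(std::string("\x00\x00\x01\x02", 4));
    assert(readInt(in) == 258);  // 0x00000102
}
```

The caller decides whether the next four bytes are an int or a float by choosing which function to call; nothing in the stream itself records the type.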

Again, you don't need to do any of this for your assignment.

Answer 3

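I'm not sure you need anything beyond reading the whole file as bytes, but if you do need a getInt() or getFloat(), here is one way:

#include <fstream>

// Reads sizeof(T) raw bytes from the stream into a T.
// Only safe for trivially copyable types (int, float, plain structs), and the
// file must use the same byte order and type size as the reading machine.
template<typename T>
T readType(std::ifstream& ifile){
    T result{};
    ifile.read(reinterpret_cast<char*>(&result), sizeof(T));
    return result;
}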

Example use:

std::ifstream ifile("file.txt", std::ios::binary);
int i = readType<int>(ifile);
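Since the question also covers writing the data back, here is a sketch of a matching writer. writeType is a hypothetical counterpart, not part of the answer above, and readType is redefined here on std::istream& so the pair can be checked against an in-memory stream. Both assume the file is read on a machine with the same byte order and type sizes as the one that wrote it.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Write the in-memory bytes of `value` to the stream.
template <typename T>
void writeType(std::ostream& out, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    out.write(reinterpret_cast<const char*>(&value), sizeof(T));
}

// Read sizeof(T) bytes back into a T; accepts std::ifstream as well.
template <typename T>
T readType(std::istream& in) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T result{};
    in.read(reinterpret_cast<char*>(&result), sizeof(T));
    return result;
}

// Usage check: values come back in the order they were written.
void demoRoundTrip() {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    writeType<int>(buffer, 42);
    writeType<float>(buffer, 1.5f);
    assert(readType<int>(buffer) == 42);
    assert(readType<float>(buffer) == 1.5f);
}
```

The reader has to request the same types in the same order as the writer, because the file stores only bytes and no type information.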

3 answers

solution1
1 2020-05-18 06:42:14

solution2
1 2020-05-18 06:51:50

solution3
0 2020-05-18 07:11:01
