
Reading bytes in C++

I'm trying to read bytes from a binary file, but without success. I've tried many solutions, but none of them gives the expected result. Structure of the file:

[offset] [type]          [value]          [description] 
0000     32 bit integer  0x00000803(2051) magic number 
0004     32 bit integer  60000            number of images 
0008     32 bit integer  28               number of rows 
0012     32 bit integer  28               number of columns 
0016     unsigned byte   ??               pixel 
0017     unsigned byte   ??               pixel 
........ 
xxxx     unsigned byte   ??               pixel

How I tried (doesn't work):

auto myfile = fopen("t10k-images.idx3-ubyte", "r");
char buf[30];
auto x = fread(buf, 1, sizeof(int), myfile);

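Read the bytes as unsigned char:

#include <fstream>
#include <vector>

std::ifstream in;   // "if" is a keyword and cannot be used as a variable name

in.open("filename", std::ios::binary);

if (in.fail())
{
    //error
}

std::vector<unsigned char> bytes;

char c;

// get() does unformatted input; operator>> would silently skip
// bytes that happen to look like whitespace (0x20, 0x0A, ...)
while (in.get(c))
{
    bytes.push_back(static_cast<unsigned char>(c));
}

in.close();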

Then to turn multiple bytes into a 32-bit integer for example:

uint32_t number;

number = ((static_cast<uint32_t>(byte3) << 24)
    | (static_cast<uint32_t>(byte2) << 16) 
    | (static_cast<uint32_t>(byte1) << 8) 
    | (static_cast<uint32_t>(byte0)));

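This should cover endian issues. It doesn't matter whether an int is laid out as B0B1B2B3 or B3B2B1B0 in memory on your system, since the conversion is done with bit shifts; the code doesn't assume any particular order in memory. You only need to know the order in which the bytes are stored in the file, so that byte3 is the most significant byte.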
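To make the mapping between file order and byte3..byte0 concrete, here is a minimal sketch of the same shifts applied to a four-byte buffer. The function names from_big_endian and from_little_endian are mine, not from the answer above, and which of the two applies depends on how the file was written.

```cpp
#include <cstdint>

// Four bytes stored most-significant-first (big-endian):
// p[0] plays the role of byte3, p[3] the role of byte0.
std::uint32_t from_big_endian(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}

// Four bytes stored least-significant-first (little-endian):
// p[3] plays the role of byte3, p[0] the role of byte0.
std::uint32_t from_little_endian(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[3]) << 24)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[1]) << 8)
         |  static_cast<std::uint32_t>(p[0]);
}
```

If the first four bytes of the file are 00 00 08 03, from_big_endian gives 2051, which matches the magic number in the question, while from_little_endian gives 0x03080000.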

This is how you read a uint32_t from a file (the bytes are interpreted in the host's native byte order):

auto f = fopen("", "rb"); // note the b: for binary files you need to specify 'b'

std::uint32_t magic = 0;
fread (&magic, sizeof(std::uint32_t), 1, f);

Hope this helps.

Knowing the endianness of your file layout when reading multi-byte numerics is important. Assuming big-endian is always the written format, and assuming the value is indeed a 32-bit unsigned value:

uint32_t magic = 0;
unsigned char bytes[4];
if (1 == fread(bytes, sizeof(bytes), 1, f))
{
   // cast before shifting so the shift happens on an unsigned 32-bit value
   magic = ((uint32_t)bytes[0] << 24) |
           ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8) |
            (uint32_t)bytes[3];
}

Note: this will work regardless of whether the reader (your program) is little-endian or big-endian. Each byte is cast to uint32_t before shifting, so that the shift is not performed on a promoted signed int. The only safe and portable way of reading multi-byte numerics is to (a) know the endianness they were written with, and (b) read and assemble them byte by byte.

The C++ stream library function read() can be used for binary file I/O. Given the code example from the link, I would start like this:

std::ifstream myfile("t10k-images.idx3-ubyte", std::ios::binary);
std::uint32_t magic, numim, numro, numco;

myfile.read(reinterpret_cast<char*>(&magic), 4);
myfile.read(reinterpret_cast<char*>(&numim), 4);
myfile.read(reinterpret_cast<char*>(&numro), 4);
myfile.read(reinterpret_cast<char*>(&numco), 4);

// Changing byte order if necessary
//endswap(&magic);
//endswap(&numim);
//endswap(&numro);
//endswap(&numco);

if (myfile) {
    std::cout << "Magic = "  << magic << std::endl
              << "Images = " << numim << std::endl
              << "Rows = "   << numro << std::endl
              << "Cols = "   << numco << std::endl;
}

If the byte order (endianness) needs to be reversed, you could write a simple reverse function like this one: endswap()
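The linked endswap() is not reproduced here, so the following is only a guess at what such a function might look like: a small template that reverses the bytes of an object in place. It takes a pointer, to match the commented-out calls above.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the endswap() mentioned above:
// reverses the byte order of *value in place.
// Intended for plain integer types such as std::uint32_t.
template <typename T>
void endswap(T* value)
{
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, value, sizeof(T));    // copy out the raw bytes
    std::reverse(bytes, bytes + sizeof(T));  // flip their order
    std::memcpy(value, bytes, sizeof(T));    // copy them back
}
```

The swap should only be applied when the byte order of the file differs from that of the machine. For example, if the file is big-endian and it is read with read() on a little-endian host, magic comes out as 0x03080000, and endswap(&magic) turns it into 0x00000803 (2051).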
