简体   繁体   中英

How to read little endian integers from file in C++?

Say I have a binary file; it contains positive binary numbers, but written in as 32-bit integers写成 32 位整数

How do I read this file? I have this right now.

int main() {
    FILE * fp;
    char buffer[4];
    int num = 0;
    fp=fopen("file.txt","rb");
    while ( fread(&buffer, 1, 4,fp) != 0) {

        // I think buffer should be 32 bit integer I read,
        // how can I let num equal to 32 bit little endian integer?
    }
    // Say I just want to get the sum of all these binary little endian integers,
    // is there an another way to make read and get sum faster since it's all 
    // binary, shouldnt it be faster if i just add in binary? not sure..
    return 0;
}

This is one way to do it that works on either big-endian or little-endian architectures:

int main() {
    unsigned char bytes[4];
    int sum = 0;
    FILE *fp=fopen("file.txt","rb");
    while ( fread(bytes, 4, 1,fp) != 0) {
        sum += bytes[0] | (bytes[1]<<8) | (bytes[2]<<16) | (bytes[3]<<24);
    }
    return 0;
}

If you are using linux you should look here ;-)

It is about useful functions such as le32toh

From CodeGuru :

inline void endian_swap(unsigned int& x)
{
    x = (x>>24) | 
        ((x<<8) & 0x00FF0000) |
        ((x>>8) & 0x0000FF00) |
        (x<<24);
}

So, you can read directly to unsigned int and then just call this.

while ( fread(&num, 1, 4,fp) != 0) {
    endian_swap(num); 
    // conversion done; then use num
}

If you are working with short files, I recommend the simple use of the class stringstream and then the function stoul. The code below reads byte per byte (in this case 2 bytes) from an ifstream and writes them in hex inside a string stream. Then thanks to stoul converts the string into a 16 bit integer:

#include <sstream>
#include <iomanip>   

using namespace std;

ifstream is("filename.bin", ios::binary);
if(!is) { /*Error*/ }
is.unsetf(ios_base::skipws); 

stringstream ss;
uint8_t byte1, byte2;
uint16_t val;

is >> byte1; is >> byte2;
ss << setw(2) << setfill('0') << hex << static_cast<size_t>(byte1);
ss << setw(2) << setfill('0') << hex << static_cast<size_t>(byte2);
val = static_cast<uint16_t>(stoul(ss.str(), nullptr, 16));

cout << val << endl;

For example if you have to read from a binary file, a 16 bit integer stored in Big Endian (00 f3), you put it inside a stringstream ("00f3") and then convert it in a integer (243). The example writes the value in hex, but it could be dec or oct, even binary, using the class bitset. The iomanip functions (setw, setfill) are used to give a correct format to the sstream. The bad of this method is that it's tremendously slow if you have to work with files large in size.

You read the code normally. However when you go to interpret the data you need to make the proper conversions.

This can be a pain in the butt as if you want to make your code portable, ie to run in both little and big endian machines, you need to handle all types of combinations: little to big, big to little, little to little and big to big. In the last two cases a no-op.

Fortunately this all can be automated with the boost::endian library . An example from their documentation:

#include <iostream>
#include <cstdio>
#include <boost/endian/arithmetic.hpp>
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  //  This is an extract from a very widely used GIS file format.
  //  Why the designer decided to mix big and little endians in
  //  the same file is not known. But this is a real-world format
  //  and users wishing to write low level code manipulating these
  //  files have to deal with the mixed endianness.

  struct header
  {
    big_int32_t     file_code;
    big_int32_t     file_length;
    little_int32_t  version;
    little_int32_t  shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U);  // reality check

  h.file_code   = 0x01020304;
  h.file_length = sizeof(header);
  h.version     = 1;
  h.shape_type  = 0x01020304;

  //  Low-level I/O such as POSIX read/write or <cstdio>
  //  fread/fwrite is sometimes used for binary file operations
  //  when ultimate efficiency is important. Such I/O is often
  //  performed in some C++ wrapper class, but to drive home the
  //  point that endian integers are often used in fairly
  //  low-level code that does bulk I/O operations, <cstdio>
  //  fopen/fwrite is used for I/O in this example.

  std::FILE* fi = std::fopen(filename, "wb");  // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}

After compiling and executing endian_example.cpp, a hex dump of test.dat shows:

01020304 00000010 01000000 04030201

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM