C++ reading binary files

I want to understand how reading binary files works in C++. My code:

int main() {
    ifstream ifd("input.png",ios::binary |ios::ate);
    int size = ifd.tellg();
    ifd.seekg(0,  ios::beg);
    vector<char> buffer;
    buffer.reserve(size);
    ifd.read(buffer.data(), size);

    cout << buffer.data();
    return 0;
}

I thought that if I cout my buffer I would get the result in binary, but that is not the case.

My output is: ˙Ř˙á6Exif

And if I read a text file, it displays the text in normal form, not in binary. Obviously my logic is not right here. How can I read files into a buffer so that it contains binary values? P.S. I'm doing this to implement the Shannon-Fano algorithm, so if anyone has any advice on reading a binary file I would be grateful.

You need to resize your vector, not reserve it:

int main()
{
    ifstream ifd("input.png", ios::binary | ios::ate);
    int size = ifd.tellg();
    ifd.seekg(0, ios::beg);
    vector<char> buffer;
    buffer.resize(size); // << resize not reserve
    ifd.read(buffer.data(), size);

    cout.write(buffer.data(), buffer.size()); // you cannot just output buffer.data() with <<, as the buffer won't have a '\0' end-of-string terminator
}

Otherwise your code tries to read size characters into an empty buffer. You may as well use the vector constructor that sets the vector's size: vector<char> buffer(size);
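To put the pieces together, here is a minimal sketch of a helper that reads a whole file into a sized buffer and checks for failures along the way. The name read_file and the choice to throw std::runtime_error are my own assumptions for illustration, not part of the code above.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): reads a whole file
// into a byte buffer and throws if the file cannot be opened or read.
std::vector<char> read_file(const std::string& path)
{
    // Open in binary mode, positioned at the end so tellg() reports the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    const std::streamoff size = in.tellg();
    if (size < 0)
        throw std::runtime_error("cannot determine size of " + path);
    in.seekg(0, std::ios::beg);

    // The constructor creates `size` elements, so data() points at real storage.
    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (size > 0 && !in.read(buffer.data(), static_cast<std::streamsize>(size)))
        throw std::runtime_error("short read on " + path);
    return buffer;
}
```

With this, the body of main could shrink to something like vector<char> buffer = read_file("input.png"); followed by whichever dump routine you prefer.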

You can output the byte values of your buffer this way:

void dumpbytes(const vector<char>& v)
{
    for (size_t i = 0; i < v.size(); ++i)
    {
        printf("%u ", (unsigned char)v[i]);
        if ((i+1) % 16 == 0)
            printf("\n");
    }
    printf("\n");
}

Or do something like what common hex editors do for hex output:

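void dumphex(const vector<char>& v)
{
    const int N = 16;                       // bytes per output row
    const char hex[] = "0123456789ABCDEF";
    char buf[N*4+5+1];                      // 3 chars per byte, 5-space gap, N ASCII chars, '\0'
    if (v.empty())
        return;                             // nothing to print; buf would be uninitialized
    for (size_t i = 0; i < v.size(); ++i)
    {
        int n = i % N;
        if (n == 0)
        {
            if (i)
                puts(buf);                  // puts appends the newline itself
            memset(buf, 0x20, sizeof(buf)); // start each row as all spaces
            buf[sizeof(buf) - 1] = '\0';
        }
        unsigned char c = (unsigned char)v[i];
        buf[n*3+0] = hex[c / 16];
        buf[n*3+1] = hex[c % 16];
        buf[3*N+5+n] = (c>=' ' && c<='~') ? c : '.'; // printable ASCII or '.'
    }
    puts(buf);                              // flush the last (possibly partial) row
}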

A buffer containing "Hello World!" would be printed as follows:

48 65 6C 6C 6F 20 57 6F 72 6C 64 21                  Hello World!

Opening a file in binary mode means that line endings won't be transparently translated between the CR/LF/CRLF formats when you read or write.

It doesn't have any effect at all on how your computer prints a string, seven lines later. I don't know what "get the result in binary" means, but I suggest rendering the contents of your vector<char> by printing its constituent bytes, one at a time, in their hex-pair representation:

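// requires <iostream> and <iomanip>
std::cout << std::hex << std::setfill('0');
for (const auto byte : buffer)
   // cast to an integer type, otherwise the char is printed as a character
   std::cout << std::setw(2) << static_cast<int>(static_cast<unsigned char>(byte));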

The output will look something like:

0123456789abcdef0123456789abcdef

Each pair of characters represents the 0-255 value of one byte in your data, using the base-16 (or "hex") numerical system. This is a common representation of non-text information.
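If you would rather have the hex pairs as a string than printed directly, the mapping can be sketched like this. The function name to_hex is hypothetical. The cast through unsigned char matters because plain char may be signed, in which case bytes above 0x7F are negative.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: renders each byte as two lowercase hex digits.
std::string to_hex(const std::vector<char>& data)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(data.size() * 2);
    for (char ch : data)
    {
        // Go through unsigned char first: plain char may be signed, and a
        // negative value would otherwise not map onto 0-255.
        const unsigned char byte = static_cast<unsigned char>(ch);
        out.push_back(digits[byte >> 4]);   // high nibble
        out.push_back(digits[byte & 0x0F]); // low nibble
    }
    return out;
}
```

For example, a buffer holding the two characters 'H' and 'i' comes out as "4869", since 'H' is 0x48 (72) and 'i' is 0x69 (105) in ASCII.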

Alternatively, you could output the data in base-2 (literally "binary").

It's up to you how to present the information. The file open mode has nothing to do with your vector.

You also need to fix your vector's size; at the moment you call .reserve when you meant .resize.

Based on Pavel's answer, you can also add this to see the data in real binary, namely 0s and 1s. Do not forget to include the bitset header.

void dumpbin(const vector<char>& v)
{
    for (size_t i = 0; i < v.size(); ++i)
    {
        cout << bitset<8>((unsigned char)v[i]) << " ";
        if ((i + 1) % 8 == 0)
            cout << "\n"; // stay on cout rather than mixing in printf
    }
    cout << "\n";
}
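Since the question mentions Shannon-Fano, the first thing you will probably want from the buffer is how often each byte value occurs. A minimal sketch, assuming the same vector<char> buffer as above (count_bytes is a name I made up for illustration):

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often each of the 256 byte values occurs.
std::array<std::size_t, 256> count_bytes(const std::vector<char>& data)
{
    std::array<std::size_t, 256> counts{}; // value-initialized to all zeros
    for (char ch : data)
        ++counts[static_cast<unsigned char>(ch)]; // index by the 0-255 value
    return counts;
}
```

The result is indexed by byte value, so counts[0x48] is the number of 'H' bytes in the file. Sorting the symbols by these counts would be the usual next step, but that part is outside what this thread covers.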
