How to write a binary file in C++
I'm trying to implement Huffman's encoding algorithm in C++.
My question is: after I get the equivalent binary string for each character, how can I write those zeros and ones to a file as actual bits, rather than as the characters '0' and '1'?
Thanks in advance.
Obtaining the encoding of each character individually, in a separate data structure, is a broken solution, because you need to juxtapose the encodings of the characters in the resulting binary file: storing them individually makes that as hard as storing them contiguously in a vector of bits in the first place.
This consideration suggests using a std::vector<bool> to perform your task, but that is a broken solution too, because it can't be treated as a C-style array, and you really need that at output time.
This question asks precisely which are the valid alternatives to std::vector<bool>, so I think the answers to that question fit your question perfectly.
BTW, what I would do is just wrap a std::vector<uint8_t> in a class that suits your needs, like the code attached:
#include <iostream>
#include <vector>
#include <cstddef>
#include <cstdint>
#include <algorithm>
class bitstream {
private:
    std::vector<std::uint8_t> storage;
    // Number of bits already used in the last byte (0..7, wraps around)
    unsigned int bits_used:3;
    void alloc_space();
public:
    bitstream() : bits_used(0) { }
    void push_bit(bool bit);
    template <typename T>
    void push(T t);
    std::uint8_t *get_array();
    size_t size() const;
    // beware: no reference!
    bool operator[](size_t pos) const;
};
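void bitstream::alloc_space()
{
    // Start a new zeroed byte when the previous one is full (or none exists yet)
    if (bits_used == 0) {
        std::uint8_t push = 0;
        storage.push_back(push);
    }
}
void bitstream::push_bit(bool bit)
{
    alloc_space();
    // Fill each byte from its most significant bit downwards
    storage.back() |= bit << (7 - bits_used++);
}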
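template <typename T>
void bitstream::push(T t)
{
    // Bytes are pushed in the host's memory order (endianness-dependent)
    std::uint8_t *t_byte = reinterpret_cast<std::uint8_t*>(&t);
    for (size_t i = 0; i < sizeof(t); i++) {
        std::uint8_t byte = t_byte[i];
        if (bits_used > 0) {
            // High bits of byte fill the free low bits of the last stored byte
            storage.back() |= byte >> bits_used;
            // The remaining low bits of byte become the high bits of a new byte
            std::uint8_t to_push = static_cast<std::uint8_t>((byte & ((1 << bits_used) - 1)) << (8 - bits_used));
            storage.push_back(to_push);
        } else {
            storage.push_back(byte);
        }
    }
}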
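std::uint8_t *bitstream::get_array()
{
    // data() is also valid when the vector is empty
    return storage.data();
}
size_t bitstream::size() const
{
    // Number of bits stored so far
    if (storage.empty())
        return 0;
    // bits_used == 0 with a non-empty vector means the last byte is full
    return (storage.size() - 1) * 8 + (bits_used == 0 ? 8 : bits_used);
}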
bool bitstream::operator[](size_t pos) const
{
    // No range checking
    return static_cast<bool>((storage[pos / 8] >> (7 - (pos % 8))) & 0x1);
}
int main()
{
    bitstream bs;
    bs.push_bit(true);
    std::cout << bs[0] << std::endl;
    bs.push_bit(false);
    std::cout << bs[0] << "," << bs[1] << std::endl;
    bs.push_bit(true);
    bs.push_bit(true);
    std::uint8_t to_push = 0xF0;
    // push() is the member declared in the class; there is no push_byte()
    bs.push(to_push);
    for (size_t i = 0; i < bs.size(); i++)
        std::cout << bs[i] << ",";
    std::cout << std::endl;
}
I hope this code can help you.
The else branch in the code below can be removed if byte is set to 0 after each file writing operation (or, more generally, every time it has been completely filled and flushed somewhere else), so that only 1s must be written.
void writeBinary(char *huffmanEncoding, int sequenceLength)
{
    unsigned char byte = 0;
    // For each bit of the sequence (each element holds the value 0 or 1, not '0' or '1')
    for (int i = 0; i < sequenceLength; i++) {
        char bit = huffmanEncoding[i];
        // Add a single bit to byte
        if (bit == 1) {
            // MSB of the sequence to MSB of the file
            byte |= (1 << (7 - (i % 8)));
            // equivalent form: byte |= (1 << (~i & 7));
        }
        else {
            // MSB of the sequence to MSB of the file
            byte &= ~(1 << (7 - (i % 8)));
            // equivalent form: byte &= ~(1 << (~i & 7));
        }
        // The byte is complete once its eighth bit has been handled
        if ((i % 8) == 7) {
            //writeByteToFile(byte);
        }
    }
    // Fill the last incomplete byte, if any, and write to file
    if ((sequenceLength % 8) != 0) {
        // Clear the low bits left over from the previous byte
        byte &= ~((1 << (8 - (sequenceLength % 8))) - 1);
        //writeByteToFile(byte);
    }
}
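If the Huffman codes are held as a string of '0' and '1' characters, as in the question, the same idea can be sketched as follows. This is an illustration under assumptions rather than the answer's own code: packBits is a hypothetical name, the first character goes to the most significant bit of the first byte, and the last byte is padded with zero bits.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: packs a string of '0'/'1' characters into bytes.
// Any character other than '1' is treated as a 0 bit.
std::vector<unsigned char> packBits(const std::string& bits)
{
    // Round the bit count up to whole bytes; all bytes start as zero
    std::vector<unsigned char> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] == '1') {
            bytes[i / 8] |= static_cast<unsigned char>(1u << (7 - (i % 8)));
        }
    }
    return bytes;
}
```

Because every byte starts at zero, only the 1 bits have to be written, which is the simplification the note above describes.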
You can't write individual bits to a binary file; the smallest unit of data that can be written is one byte (thus 8 bits).
So what you should do is create a buffer (of any size).
char BitBuffer = 0;
Writing to a buffer:
int Location;   // bit position inside the buffer
bool Value;     // bit value to store at that position
if (Value)
    BitBuffer |= (1 << Location);
else
    BitBuffer &= ~(1 << Location);
The code (1 << Location) generates a number with all bits 0 except the one at the position specified by Location. Then, if Value is true, the corresponding bit in the buffer is set to 1, and otherwise it is set to 0. The binary operations used are fairly simple; if you don't understand them, they should be covered in any good C++ book or tutorial.
Location should be a number in the range 0 to 8 * sizeof(BitBuffer) - 1, so 0 to 7 in this case.
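The set-or-clear step can be wrapped in a small function to try it out. This is a sketch only: setBit is a hypothetical name, and it uses unsigned char rather than char to keep the bit operations free of sign issues.

```cpp
// Hypothetical helper: sets or clears bit `location` of `buffer`,
// where location 0 is the least significant bit and 7 the most significant.
void setBit(unsigned char& buffer, unsigned location, bool value)
{
    const unsigned char mask = static_cast<unsigned char>(1u << location);
    if (value)
        buffer |= mask;                               // force the bit to 1
    else
        buffer &= static_cast<unsigned char>(~mask);  // force the bit to 0
}
```

For example, starting from 0, setting bits 0 and 7 gives 0x81, and clearing bit 0 again leaves 0x80.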
Writing the buffer to a file is relatively simple when using fstream. Just remember to open it as binary.
std::ofstream File;
File.open("file.txt", std::ios::out | std::ios::binary);
// write() expects a pointer to the data and a byte count
File.write(&BitBuffer, sizeof(char));
EDIT: Noticed a bug and fixed it.
EDIT2: Don't use the << operators for this, I forgot about it: they perform formatted output even when the stream is opened in binary mode, so raw bytes should go through write().
Alternative solution: use std::vector<bool> or std::bitset as a buffer.
This should be even simpler, but I thought I could help you a little bit more.
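void WriteData(std::vector<bool> const& data, std::ofstream& str)
{
    char Buffer = 0;
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        // Buffer setting code from above, with Location = i % 8 and Value = data[i];
        // Buffer starts at zero, so only the 1 bits need to be set
        if (data[i])
            Buffer |= (1 << (i % 8));
        if (i % 8 == 7)
        {
            // Buffer is full: write it out and start a fresh byte
            str.write(&Buffer, 1);
            Buffer = 0;
        }
    }
    // It might happen that data.size() % 8 != 0. The unused bits of the last
    // byte are already zero, so it is written once more on its own.
    if (data.size() % 8 != 0)
        str.write(&Buffer, 1);
}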