How to use Huffman codes to compress a file?
My program stores a Huffman code in a char[8] variable. I want to store it in an unsigned char variable. I did this, but I don't think it works correctly, because when I used the following code to extract my file it didn't work:
unsigned char bit2byte ( unsigned char bits[8] ) {
    unsigned char x = 0;
    for ( int k = 0; k < 8; k++ ) {
        if ( bits[k] == '1' )
            x = x | 1;
        x <<= 1;
    }
    return x;
}
What about this line:
if ( bits[k] == '1' )
Does the bits array store your bits as ASCII characters or as numeric values? That is, what happens if you try:
if ( bits[k] == 0x01 )
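To make the distinction concrete: the character '1' has the value 0x31 in ASCII, so a comparison against '1' never matches an array that holds the raw values 0 and 1, and vice versa. As a minimal sketch (the helper name bit_is_set is made up for illustration and is not part of the question's code), a check that tolerates either representation could look like this:

```c
/* Hypothetical helper: returns 1 if the element represents a set bit,
   whether the array holds ASCII digits ('1' == 0x31) or raw numeric
   values (0x01). Anything else counts as a cleared bit. */
int bit_is_set(unsigned char b) {
    return b == '1' || b == 0x01;
}
```

Accepting both forms is only a debugging aid; once you know which representation your encoder writes, compare against that one only.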
You'll probably downvote me for not being able to read your mind...
Huffman coding is a compression scheme, and if you want to read a Huffman-encoded file you most likely want to decode it (i.e. uncompress it):
http://en.wikipedia.org/wiki/Huffman_coding
In Huffman-encoded data, each character is represented by a variable number of bits, so you cannot process a file by simply passing in a fixed-size portion of it and expecting a single byte back from each call. You have to keep track of how many bits each call consumed, and of where in the bit stream to start when extracting the next byte.
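One way to keep that state is a small bit reader that remembers its position between calls. This is only a sketch: the BitReader type and next_bit function are hypothetical names, and it assumes the bits are packed most significant bit first and that the number of valid bits is known, which depends on how your file was written.

```c
#include <stddef.h>

/* Hypothetical bit reader: remembers the position in the bit stream
   between calls, so the caller can consume any number of bits. */
typedef struct {
    const unsigned char *data;  /* packed input bytes */
    size_t nbits;               /* total number of valid bits in data */
    size_t pos;                 /* index of the next bit to read */
} BitReader;

/* Returns the next bit (0 or 1), or -1 at the end of the stream.
   Assumes bits are packed most significant bit first. */
int next_bit(BitReader *r) {
    if (r->pos >= r->nbits)
        return -1;
    unsigned char byte = r->data[r->pos / 8];
    int bit = (byte >> (7 - r->pos % 8)) & 1;
    r->pos++;
    return bit;
}
```

A decoder would call next_bit repeatedly until a complete code has been read, emit the corresponding character, and carry on from wherever pos was left.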
To decode Huffman data correctly, you will need the encoding tree (see the Wikipedia link). This tree is most likely stored in the file as well, so your file will most likely have two parts: (1) the encoding/decoding tree, and (2) the data. How these are stored in the file is implementation-specific, so you will need the specification for that before you attempt to decode anything.
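To illustrate the decoding step itself, here is a minimal sketch that walks a tree one bit at a time. Everything about the layout is an assumption made for the example: the Node type, the array-based tree with the root at index 0, the most-significant-bit-first packing, and the fact that the bit count is passed in separately. A real file format defines all of these for you.

```c
#include <stddef.h>

/* Hypothetical tree node: a leaf has both children equal to -1. */
typedef struct {
    int child[2];   /* indices of the 0-branch and 1-branch, or -1 */
    char symbol;    /* meaningful only for leaves */
} Node;

/* Decodes nbits bits from data (assumed MSB first) by walking the tree
   from the root at index 0. Writes at most out_cap symbols to out and
   returns the number of symbols written. */
size_t huffman_decode(const Node *tree, const unsigned char *data,
                      size_t nbits, char *out, size_t out_cap) {
    size_t n = 0;
    int cur = 0;
    for (size_t pos = 0; pos < nbits && n < out_cap; pos++) {
        int bit = (data[pos / 8] >> (7 - pos % 8)) & 1;
        cur = tree[cur].child[bit];
        if (cur < 0)
            break;  /* malformed input: no such branch */
        if (tree[cur].child[0] < 0 && tree[cur].child[1] < 0) {
            out[n++] = tree[cur].symbol;  /* reached a leaf */
            cur = 0;                      /* restart at the root */
        }
    }
    return n;
}
```

For example, with the codes a = 0, b = 10 and c = 11, the six bits 010110 (stored as the byte 0x58 with two padding bits) decode to "abca". Passing the bit count explicitly is what keeps the padding bits from being decoded as extra symbols.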
Hope this helps.
I'm not clear on what you mean by "doesn't work", but it could be that you need to go the other way:
for (int k = 7; k >= 0; k--) {
and everything else as before.
Of course, I also don't know why you would ever use 8 bytes to store only 8 bits of information.
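Independently of the loop direction, note that the function in the question shifts after setting the low bit, so after eight iterations the first bit has been pushed out of the byte and bit 0 is always zero. A sketch of both bit orders with the shift moved before the OR follows; the two function names are made up for illustration, and the array is assumed to hold the ASCII characters '0' and '1' as in the question.

```c
/* Packs eight ASCII '0'/'1' characters into one byte, with bits[0] as
   the most significant bit. The shift happens BEFORE the OR, so the
   last bit lands in position 0 and the first bit is not pushed out. */
unsigned char bit2byte_msb_first(const unsigned char bits[8]) {
    unsigned char x = 0;
    for (int k = 0; k < 8; k++) {
        x = (unsigned char)(x << 1);
        if (bits[k] == '1')
            x |= 1;
    }
    return x;
}

/* Same idea with bits[0] as the least significant bit: the array is
   walked from the end, as suggested above. */
unsigned char bit2byte_lsb_first(const unsigned char bits[8]) {
    unsigned char x = 0;
    for (int k = 7; k >= 0; k--) {
        x = (unsigned char)(x << 1);
        if (bits[k] == '1')
            x |= 1;
    }
    return x;
}
```

With the input "01000001", the first variant returns 0x41 and the second returns 0x82. Which one is right depends on the order in which your encoder wrote the bits.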