
How to use Huffman codes to compress a file?

My program stores each Huffman code in a char[8] variable, one character per bit. I want to store it in a single unsigned char variable instead. I tried, but I don't think it works correctly, because extracting my file failed when I used the following code:

unsigned char bit2byte ( unsigned char bits[8] ) {
    unsigned char x = 0; 

    for ( int k = 0; k < 8; k++ ) {
        if ( bits[k] == '1' ) 
            x = x | 1;

        x <<= 1; 
    }

    return x; 
}

What about this line:

if ( bits[k] == '1' ) 

Does the bits array store your bits as ASCII characters or as numeric values? In other words, what happens if you try

if ( bits[k] == 0x01 )

You'll probably downvote me for not being able to read your mind...

Huffman coding is a compression scheme, and if you want to read a Huffman-encoded file you most likely want to decode it (i.e. decompress it):

http://en.wikipedia.org/wiki/Huffman_coding

In Huffman-encoded data, each character is represented by a variable number of bits. You therefore cannot process a file by passing in a fixed-size portion and expecting a single byte back from each call. You have to keep state: how many bits each call consumed, and where in the bit stream to start when extracting the next byte.
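The state described above can be as small as a bit index into the buffer. The following is a minimal sketch, not the asker's code: the `BitReader` type and `bitreader_next` function are hypothetical names, and it assumes bits are packed most-significant-bit first within each byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader: remembers where the next bit lives. */
typedef struct {
    const uint8_t *data;  /* packed input bytes */
    size_t         nbits; /* total number of valid bits in data */
    size_t         pos;   /* index of the next bit to read */
} BitReader;

/* Returns the next bit (0 or 1), or -1 when the stream is exhausted.
   Assumption: bits are packed most-significant-bit first in each byte. */
static int bitreader_next(BitReader *r)
{
    if (r->pos >= r->nbits)
        return -1;
    uint8_t byte = r->data[r->pos / 8];
    int bit = (byte >> (7 - (r->pos % 8))) & 1;
    r->pos++;
    return bit;
}
```

Because `pos` survives between calls, a decoder can consume three bits for one symbol and seven for the next without caring where byte boundaries fall.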

To decode Huffman data correctly, you need the encoding tree (see the Wikipedia link). That tree is most likely stored in the file as well, so your file probably has two parts: (1) the encoding/decoding tree and (2) the data. How these are laid out in the file is implementation-specific, so you need that specification before you attempt to decode anything.
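Once the tree has been loaded, decoding is a walk from the root, one bit per step, emitting a symbol at each leaf. This is only a sketch under assumptions: the `Node` layout and `huffman_decode` are hypothetical, the tree is assumed to be already built in memory with at least two leaves, and the data is assumed to be packed most-significant-bit first. How the tree is read from a real file depends on that file's format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tree node: a leaf has no children and carries a symbol. */
typedef struct Node {
    const struct Node *child[2]; /* child[0] follows bit 0, child[1] bit 1 */
    unsigned char      symbol;   /* meaningful only in a leaf */
} Node;

/* Decodes at most out_cap symbols from nbits bits of packed data.
   Assumptions: MSB-first packing; the tree has at least two leaves.
   Returns the number of symbols written to out. */
static size_t huffman_decode(const Node *root, const uint8_t *data,
                             size_t nbits, unsigned char *out, size_t out_cap)
{
    const Node *node = root;
    size_t n = 0;

    for (size_t i = 0; i < nbits && n < out_cap; i++) {
        int bit = (data[i / 8] >> (7 - (i % 8))) & 1;
        node = node->child[bit];
        if (node == NULL)
            break;                 /* malformed input or tree */
        if (node->child[0] == NULL && node->child[1] == NULL) {
            out[n++] = node->symbol; /* reached a leaf: emit its symbol */
            node = root;             /* restart for the next code */
        }
    }
    return n;
}
```

For example, with the codes a = 0, b = 10, c = 11, the six bits 010110 decode to "abca".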

Hope this helps.

I'm not clear on what you mean by "doesn't work", but it could be that you need to go the other way:

for (int k = 7; k >= 0; k--) {

and everything else as before.
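Here is a minimal sketch of both loop directions, assuming each element of bits holds the ASCII character '0' or '1'. The two function names are made up for illustration. Note that both versions shift before OR-ing in the new bit. With the shift placed after the OR, as in the question, the final shift pushes the first bit processed out of the 8-bit result, so the returned byte is always even.

```c
/* Packs eight '0'/'1' characters into one byte, with bits[0] becoming
   the most significant bit. Shifting first keeps all eight bits. */
static unsigned char bit2byte_msb_first(const unsigned char bits[8])
{
    unsigned char x = 0;

    for (int k = 0; k < 8; k++) {
        x = (unsigned char)(x << 1); /* make room for the next bit */
        if (bits[k] == '1')
            x |= 1;
    }
    return x;
}

/* Same idea walking the array backwards, so bits[0] becomes the
   least significant bit. */
static unsigned char bit2byte_lsb_first(const unsigned char bits[8])
{
    unsigned char x = 0;

    for (int k = 7; k >= 0; k--) {
        x = (unsigned char)(x << 1);
        if (bits[k] == '1')
            x |= 1;
    }
    return x;
}
```

Whichever order is chosen, the extraction code has to unpack bits in the same order.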

Of course, I also don't know why you would use 8 bytes to store only 8 bits of information.
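One way to avoid the intermediate char[8] altogether is to accumulate bits directly into a byte as the encoder produces them. The sketch below is one possible shape, not the asker's design: `BitWriter`, `bitwriter_put` and `bitwriter_flush` are hypothetical names, output goes to a caller-supplied memory buffer, and the last byte is padded with zero bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit writer that packs bits MSB-first into a buffer. */
typedef struct {
    uint8_t *buf;   /* output buffer supplied by the caller */
    size_t   cap;   /* capacity of buf in bytes */
    size_t   len;   /* complete bytes written so far */
    uint8_t  acc;   /* byte currently being assembled */
    int      count; /* number of bits held in acc (0..7) */
} BitWriter;

/* Appends one bit. Returns 0 on success, -1 if the buffer is full. */
static int bitwriter_put(BitWriter *w, int bit)
{
    if (w->len >= w->cap)
        return -1; /* no room for the byte being assembled */
    w->acc = (uint8_t)((w->acc << 1) | (bit & 1));
    if (++w->count == 8) {
        w->buf[w->len++] = w->acc;
        w->acc = 0;
        w->count = 0;
    }
    return 0;
}

/* Writes any partial byte, padded on the right with zero bits. */
static int bitwriter_flush(BitWriter *w)
{
    if (w->count > 0) {
        if (w->len >= w->cap)
            return -1;
        w->buf[w->len++] = (uint8_t)(w->acc << (8 - w->count));
        w->acc = 0;
        w->count = 0;
    }
    return 0;
}
```

Because of the padding, a decoder also needs to know how many bits are valid, for instance from a bit count or symbol count stored alongside the data.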
