
What are the real-world applications of Huffman coding?

I am told that Huffman coding is used as a lossless data compression algorithm, but I am also told that real data compression software does not employ Huffman coding, because if the symbol frequencies are not unevenly distributed enough, the compressed file could be even larger than the original file.

This leaves me wondering: are there any real-world applications of Huffman coding?

Huffman is widely used in all the mainstream compression formats that you might encounter - from GZIP, PKZIP (winzip etc) and BZIP2, to image formats such as JPEG and PNG.

All compression schemes have pathological data-sets that cannot be meaningfully compressed; the archive formats I listed above simply 'store' such files uncompressed when they are encountered.
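As a rough illustration of the point about incompressible data, here is a small sketch using Python's standard `zlib` module, which implements DEFLATE (LZ77 followed by Huffman coding). The helper name `deflate_sizes` and the sample inputs are made up for this example. Random bytes come out about the same size or slightly larger, while repetitive data shrinks dramatically.

```python
import os
import zlib

def deflate_sizes(data):
    """Return (original size, raw DEFLATE size) for the given bytes."""
    # wbits=-15 asks for a raw DEFLATE stream with no zlib header or checksum.
    compressor = zlib.compressobj(level=9, wbits=-15)
    compressed = compressor.compress(data) + compressor.flush()
    return len(data), len(compressed)

random_bytes = os.urandom(100_000)      # effectively incompressible
repetitive = b"abracadabra " * 8_000    # highly redundant

print("random:    ", deflate_sizes(random_bytes))
print("repetitive:", deflate_sizes(repetitive))
```

For the random input, DEFLATE falls back to "stored" blocks, so the overhead stays at a handful of bytes rather than growing with the file.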

Newer arithmetic and range coding schemes are often avoided because of patent issues, meaning Huffman remains the workhorse of the compression industry.

See the Wikipedia article on the subject:

Huffman coding today is often used as a "back-end" to some other compression method. DEFLATE (PKZIP's algorithm) and multimedia codecs such as JPEG and MP3 have a front-end model and quantization followed by Huffman coding.

There are quite a lot of real-world applications of Huffman encoding. ZIP is perhaps the most widely used compression tool that uses Huffman encoding as its basis. Brotli, one of the newest and most efficient lossless compression algorithms, released by Google last month, also uses Huffman coding. Apart from that, Brotli also uses LZ77 and a few other fundamental lossless compression techniques. Refer to Brotli.

When one considers compression algorithms, there are often benefits and disadvantages to each. It is the nature of compression that, for any given input, there exist better and worse compression algorithms for that data.

Huffman is really, really good at some things, most notably data that uses only a subset of the character space and uses some of those characters far more often than others. English-language text files are an example: a handful of letters account for most of the text, and the same letters tend to be followed by the same other letters.
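To make this concrete, here is a minimal sketch of building a Huffman code from symbol frequencies with Python's `heapq`. The function name `huffman_code` and the sample sentence are invented for illustration, and real formats store and order their codes differently (for example, DEFLATE uses canonical codes with length limits).

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return a dict mapping each symbol in text to its bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "this is an example of english text for a huffman tree"
code = huffman_code(text)
encoded = "".join(code[ch] for ch in text)

print(len(text) * 8, "bits as fixed 8-bit characters")
print(len(encoded), "bits after Huffman coding")
```

Frequent symbols such as the space and "e" end up with the shortest codes, which is where the saving comes from. The sketch ignores the cost of transmitting the code table itself.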

If your professor or book gave you the impression that Huffman is not used, they are wrong. For example, almost all communication to and from the internet is Huffman encoded at some point. (A number of communication protocols use it.) Most image files (JPEGs) are Huffman encoded. Most music files (MP3s) are Huffman encoded. There are many other examples.

One reason Huffman is used is that the code can be "discovered" on the fly via a slightly different algorithm called adaptive Huffman. As you read the file you learn the Huffman code and "compress as you go". This is a simplified overview, but you get the idea.
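The "compress as you go" idea can be sketched as follows. This is a deliberately naive version that rebuilds the whole code from running counts before every symbol; it is not the FGK or Vitter algorithm that real adaptive Huffman coders use to update the tree incrementally. The function names are hypothetical, and the sketch assumes both sides agree in advance on an alphabet of at least two symbols and on the message length.

```python
import heapq

def build_code(counts):
    """Deterministically build a Huffman code from a {symbol: count} dict."""
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def adaptive_encode(text, alphabet):
    counts = {s: 1 for s in alphabet}      # shared starting model
    bits = []
    for ch in text:
        bits.append(build_code(counts)[ch])
        counts[ch] += 1                    # learn from the symbol just sent
    return "".join(bits)

def adaptive_decode(bits, alphabet, length):
    counts = {s: 1 for s in alphabet}      # same starting model as the encoder
    out, pos = [], 0
    for _ in range(length):
        decode = {c: s for s, c in build_code(counts).items()}
        end = pos + 1
        # Codes are prefix-free, so the first match is the right one.
        while bits[pos:end] not in decode:
            if end > len(bits):
                raise ValueError("truncated bit stream")
            end += 1
        sym = decode[bits[pos:end]]
        out.append(sym)
        counts[sym] += 1                   # mirror the encoder's update
        pos = end
    return "".join(out)

message = "abracadabra abracadabra"
alphabet = "abcdr "
bits = adaptive_encode(message, alphabet)
print(len(bits), "bits;", adaptive_decode(bits, alphabet, len(message)))
```

No code table is transmitted: the decoder reconstructs the same sequence of codes because it applies the same count updates in the same order.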

To solve the "use the best algorithm for the situation" problem, zip files allow a number of different compression methods to be used, depending on which one is best for a given file.
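A small sketch of per-file compression methods using Python's standard `zipfile` module. The file names and contents are invented for the example, and here the caller picks the method for each entry by hand; whether and how a particular archiver chooses automatically is up to that tool.

```python
import io
import os
import zipfile

random_bytes = os.urandom(50_000)
text_bytes = b"the quick brown fox jumps over the lazy dog\n" * 1_000

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    # Incompressible data: store as-is.
    archive.writestr("noise.bin", random_bytes, compress_type=zipfile.ZIP_STORED)
    # Redundant text: DEFLATE (LZ77 + Huffman coding).
    archive.writestr("fox.txt", text_bytes, compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buffer) as archive:
    for info in archive.infolist():
        print(info.filename, info.compress_type, info.file_size, info.compress_size)
```

Each entry records its own compression method, so a single archive can mix stored and deflated files.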

Huffman coding is used to convert fixed-length codes into variable-length codes, which results in lossless compression. Variable-length codes may be further compressed using JPEG and MPEG techniques to get the desired compression ratio.

A very widespread application is the encoding of strings in HPACK, the header compression technique of HTTP/2.

The RFC (RFC 7541) directly provides a Huffman code table that is optimized for compressing HTTP headers.
