It is well known that a Huffman code with minimum variance is preferable. I've dug through the entire Polish and English internet, and this is what I found: to build a minimum-variance Huffman code you need to break ties with one of the following methods (the probability of a node is, of course, still the primary key):
The problem is that I couldn't find a proof of correctness for any of these methods. Can someone prove any of them?
I will gladly clarify anything.
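For concreteness, here is a sketch of one of the tie-breaks I mean: when two nodes have equal probability, prefer the node that was created earlier (so leaves are preferred over merged subtrees). The function name `huffman_lengths` and the use of integer frequencies (to keep ties exact) are my own choices, not from any particular reference:

```python
import heapq

def huffman_lengths(freqs):
    """Huffman code lengths, breaking frequency ties by creation order:
    nodes created earlier (leaves before merged subtrees) are preferred."""
    order = 0
    heap = []
    for sym, f in freqs.items():
        heap.append((f, order, {sym: 0}))  # map symbol -> depth so far
        order += 1
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        # merging increases every contained symbol's depth by one
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, order, merged))
        order += 1
    return heap[0][2]
```

With frequencies 4, 2, 2, 1, 1 this tie-break yields lengths 2, 2, 2, 3, 3 rather than the equally optimal but higher-variance 1, 2, 3, 4, 4.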
Some systems have an even stronger constraint than "when there's a tie, make the choice that minimizes the maximum depth of the tree" -- they set a hard limit on the maximum depth of the tree (length-limited Huffman coding, closely related to minimum-variance Huffman coding):
"Whether there is a tie or not, build a tree with a maximum depth of at most 16 steps, so the maximum codeword length is 16 bits" (as in the Huffman codes used in JPEG image compression; see the JPEG Huffman coding procedure).
"Whether there is a tie or not, build a tree with a maximum depth of at most 15 steps, so the maximum codeword length is 15 bits" (as in the Huffman codes used in DEFLATE and the Huffman codes used in gzip).
"Whether there is a tie or not, build a tree with a maximum depth of at most 12 steps, so the maximum codeword length is 12 bits" ("Huff0 uses a 12-bit limit.").
My understanding is that people have developed several heuristic algorithms for limiting Huffman codeword lengths that work reasonably well, but the heuristics don't always give exactly the best possible compression.
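One of the simplest such heuristics (a sketch of my own, not the exact procedure used by any of the codecs above): clamp every length to the limit, then repair the Kraft inequality by deepening codes that are still below the limit. The name `limit_code_lengths` is hypothetical:

```python
def limit_code_lengths(lengths, max_len):
    """Clamp Huffman code lengths to max_len, then repair the Kraft sum
    so a valid prefix code still exists (sum of 2**-l must be <= 1)."""
    out = {s: min(l, max_len) for s, l in lengths.items()}
    budget = 1 << max_len  # Kraft budget, scaled by 2**max_len
    kraft = sum(1 << (max_len - l) for l in out.values())
    while kraft > budget:
        # Deepen the longest code still below the limit by one bit;
        # that shrinks its Kraft contribution by 2**(max_len - l - 1).
        # (Assumes the alphabet fits in max_len bits at all.)
        s = max((s for s, l in out.items() if l < max_len),
                key=lambda s: out[s])
        kraft -= 1 << (max_len - out[s] - 1)
        out[s] += 1
    return out
```

This always produces a feasible set of lengths, but because it never consults the symbol probabilities when choosing which code to deepen, it can add bits to frequent symbols -- which is exactly why such heuristics don't always match the optimum.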
Several people mention "optimal length-limited Huffman codes", and apparently more than one algorithm exists to find them: