简体   繁体   中英

How to code a, and read in a binary tree of modified(!) Huffman codes in PHP

I am writing a class for the decoding of fax data encoded withmodified Huffman code.

Data is coded line by line: data describes each pixel row. Lines are coded as records of variable length. The pixel bits are stored in the bits of code words, least significant first.

Recently the code word list (182 elements) is defined as an array:

/**
 * [0] code word
 * [1] length of code word
 * [2] run length of color bits
 * [3] 0 = white / 1 = black
 * [4] 1 = termination codes / 0 = make up codes
 */
const CODEWORDS = [
   [0b00110101, 8, 0, 0, 1],             // termination codes white
   [0b000111, 6, 1, 0, 1],
   [0b0111, 4, 2, 0, 1],
   [0b1000, 4, 3, 0, 1],
   [0b1011, 4, 4, 0, 1],
   [0b1100, 4, 5, 0, 1],
   [0b1110, 4, 6, 0, 1],
   [0b1111, 4, 7, 0, 1],
   [0b10011, 5, 8, 0, 1],
   ...
];

Before usage the array is sorted in descending order according to the length of the code words.

In a first approach I´m able to find the correct code words with repeating foreach -iterations over this array - but it's (not surprising!) terribly slow .

It is clear to me, that an increase in performance can only be achieved using a binary tree. But even after looking at several explanations here or solutions (libraries) in GitHub, I can't find access to

  • how to transfer the data from the array into a binary tree
  • how to browse the tree to get the right leaf

If someone could help me there, I would be very grateful.

Once you have the correct codes (see my comments on your question), then you start by building one set of codes for white and one for black. For each, you start the tree with a branch for the first bit being zero, and another branch for one. Break up your set of codes into two sets, one set where all the codes start with zero, and the other where they all start with one. For each of those, make two branches. Break up each set based on the second bit. Once you get to a branch with one code and you just used the last bit of that code, you now have a leaf. In that leaf you store the symbol for the code, eg 63 for white code 00110100 . If you get to a branch and there are no codes, then you again have a leaf, but this time it will result in a decoding error if it is reached.

To decode, take the first bit and go down that branch. Choose the second branch depending on the second bit. And so on until you get to a leaf. Then emit that symbol and start back again at the root with the subsequent bit. Or terminate if you end up at an error leaf.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM