简体   繁体   中英

Huffman Coding Tree traversal

Writing Huffman Coding Algorithm in Python. Have successfully managed to create a tree based on a string input but am stuck on the best way to traverse it while generating the codes for each letter.

from collections import Counter
    class HuffNode:
        def __init__(self, count, letter=None):
            self.letter = letter
            self.count = count
            self.right = None
            self.left = None

    word = input()
    d = dict(Counter(word))

    Nodes = [HuffNode(d[w], w) for w in sorted(d, key=d.get, reverse=True)]

    while len(Nodes) > 1:
        a = Nodes.pop()
        b = Nodes.pop()
        c = HuffNode(a.count+b.count)
        c.left, c.right = a, b

        Nodes.append(c)
        Nodes.sort(key=lambda x: x.count, reverse=True)

For a word like "hello".

d = dict(Counter(word)) would get the frequency of each letter in the string and convert it to a dict. Thus having {'e': 1, 'l': 2, 'h': 1, 'o': 1} Each letter if then converted to a HuffNode and stored in Nodes

The while loop then proceeds to generate a tree until we only have one root

When the loop exits I'll have: Whats the best way to traverse this tree then generating the codes for each letter? Thanks

Generally speaking, you would want a recursive function that, given a HuffNode h and a prefix p:

  • if h.letter is not empty (ie h is a leaf), yields (p, h.letter) -> this is the code for the letter
  • otherwise, calls itself on h.left with prefix p + '0' and on h.right with p + '1'

A possible implementation (not tested, may have typos):

def make_code(node, prefix):
    if node is None:
        return []
    if node.letter is not None:
        return [(prefix, node.letter)]
    else:
        result = []
        result.extend(make_code(h.left, prefix + '0'))
        result.extend(make_code(h.right, prefix + '1'))
        return result

codes = make_code(root, '')

where root is the Huffman tree you built in the first step. The first test ( if node is None ) is to manage lopsided nodes, where one of the children may be empty.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM