简体   繁体   English

在PHP中创建霍夫曼树

[英]Creating a Huffman Tree in PHP

I am trying to create a Huffman Tree using PHP. 我正在尝试使用PHP创建霍夫曼树。

This is what I have: 这就是我所拥有的:

<!DOCTYPE html>
<html>
    <body>
        <?php
            $extract = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse elementum imperdiet aliquet. Duis non molestie orci. Ut eget nibh nec augue ultricies porttitor.';
            $characters = count_chars($extract, 1);            
            asort($characters);
            foreach($characters as $character => $occurrence)
            {
               echo 'There';
               if($occurrence > 1)
               {
                   echo ' were ' . $occurrence . ' occurrences of ';
               }
               else
               {
                   echo ' was '. $occurrence . ' occurrence of ';
               }
               echo '"<strong>' . chr($character) . '</strong>" in the extract.<br />';
               $characterFreq[chr($character)] = $occurrence;              
            }
            print_r($characterFreq);
        ?>
    </body>
</html>

Which outputs: 哪个输出:

There was 1 occurrence of "S" in the extract.
There was 1 occurrence of "U" in the extract.
There was 1 occurrence of "h" in the extract.
There was 1 occurrence of "L" in the extract.
There was 1 occurrence of "D" in the extract.
There was 1 occurrence of "," in the extract.
There was 1 occurrence of "q" in the extract.
There was 1 occurrence of "b" in the extract.
There were 3 occurrences of "g" in the extract.
There were 4 occurrences of "a" in the extract.
There were 4 occurrences of "." in the extract.
There were 4 occurrences of "d" in the extract.
There were 5 occurrences of "p" in the extract.
There were 6 occurrences of "c" in the extract.
There were 6 occurrences of "l" in the extract.
There were 7 occurrences of "m" in the extract.
There were 8 occurrences of "n" in the extract.
There were 8 occurrences of "r" in the extract.
There were 9 occurrences of "u" in the extract.
There were 9 occurrences of "o" in the extract.
There were 10 occurrences of "s" in the extract.
There were 15 occurrences of "t" in the extract.
There were 17 occurrences of "i" in the extract.
There were 20 occurrences of "e" in the extract.
There were 22 occurrences of " " in the extract.
Array ( [S] => 1 [U] => 1 [h] => 1 [L] => 1 [D] => 1 [,] => 1 [q] => 1 [b] => 1 [g] => 3 [a] => 4 [.] => 4 [d] => 4 [p] => 5 [c] => 6 [l] => 6 [m] => 7 [n] => 8 [r] => 8 [u] => 9 [o] => 9 [s] => 10 [t] => 15 [i] => 17 [e] => 20 [ ] => 22 )

I have been using mixtures of array_slice() , array_splice() and array_unshift() but am having trouble with recursion. 我一直在使用array_slice()array_splice()array_unshift()混合物,但是递归遇到了麻烦。

Ideally, leaves and branches will be denoted by array indices 0 & 1. 理想情况下,树叶和树枝用数组索引0和1表示。

Any ideas on how to make a Huffman tree in multidimensional array form would be greatly appreciated. 任何有关如何制作多维数组形式的霍夫曼树的想法都将不胜感激。

It's a full solution to the problem, in PHP: 在PHP中,这是对问题的完整解决方案:

function huffmannEncode($string) {
    $originalString = $string;
    $occurences = array();

    while (isset($string[0])) {
        $occurences[] = array(substr_count($string, $string[0]), $string[0]);
        $string = str_replace($string[0], '', $string);
    }

    sort($occurences);
    while (count($occurences) > 1) {
        $row1 = array_shift($occurences);
        $row2 = array_shift($occurences);
        $occurences[] = array($row1[0] + $row2[0], array($row1, $row2));
        sort($occurences);
    }

    // $dictionary is an array that gets filled with the values with a recursive method
    $dictionary = [];
    fillDictionary($dictionary, is_array($occurences[0][1]) ? $occurences[0][1] : $occurences);

    // Generate the final encoded message
    $encoded = '';
    for($i = 0; $i < strlen($originalString); $i++) {
        $encoded .= $dictionary[$originalString[$i]];
    }
    return $encoded;
}

// This function runs recursively to generate the Huffmann tree
function fillDictionary(&$dictionary, $data, $value = '') {
    if (!is_array($data[0][1])) {
        $dictionary[$data[0][1]] = $value.'0';
    } else {
        fillDictionary($dictionary, $data[0][1], $value.'0');
    }
    if (isset($data[1])) {
        if (!is_array($data[1][1])) {
            $dictionary[$data[1][1]] = $value.'1';
        } else {
            fillDictionary($dictionary, $data[1][1], $value.'1');
        }
    }
}

This function calculates the occurrences, and with a recursive sub-function it generates the dictionary that assigns a binary code to each character. 该函数计算出现次数,并使用递归子函数生成字典,该字典为每个字符分配一个二进制代码。

Here is an example: 这是一个例子:

// Test the functionality:
echo huffmannEncode('hello world');
// Output: 00100010101101110011110010101111
// And the dictionary:
/* 
[e] => 000
[h] => 001
[r] => 010
[w] => 011
[l] => 10
[o] => 110
[ ] => 1110
[d] => 1111
*/

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM