
Creating a Huffman Tree in PHP

I am trying to create a Huffman tree using PHP.

This is what I have so far:

<!DOCTYPE html>
<html>
    <body>
        <?php
            $extract = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse elementum imperdiet aliquet. Duis non molestie orci. Ut eget nibh nec augue ultricies porttitor.';
            $characters = count_chars($extract, 1);            
            asort($characters);
            foreach($characters as $character => $occurrence)
            {
               echo 'There';
               if($occurrence > 1)
               {
                   echo ' were ' . $occurrence . ' occurrences of ';
               }
               else
               {
                   echo ' was '. $occurrence . ' occurrence of ';
               }
               echo '"<strong>' . chr($character) . '</strong>" in the extract.<br />';
               $characterFreq[chr($character)] = $occurrence;              
            }
            print_r($characterFreq);
        ?>
    </body>
</html>

Which outputs:

There was 1 occurrence of "S" in the extract.
There was 1 occurrence of "U" in the extract.
There was 1 occurrence of "h" in the extract.
There was 1 occurrence of "L" in the extract.
There was 1 occurrence of "D" in the extract.
There was 1 occurrence of "," in the extract.
There was 1 occurrence of "q" in the extract.
There was 1 occurrence of "b" in the extract.
There were 3 occurrences of "g" in the extract.
There were 4 occurrences of "a" in the extract.
There were 4 occurrences of "." in the extract.
There were 4 occurrences of "d" in the extract.
There were 5 occurrences of "p" in the extract.
There were 6 occurrences of "c" in the extract.
There were 6 occurrences of "l" in the extract.
There were 7 occurrences of "m" in the extract.
There were 8 occurrences of "n" in the extract.
There were 8 occurrences of "r" in the extract.
There were 9 occurrences of "u" in the extract.
There were 9 occurrences of "o" in the extract.
There were 10 occurrences of "s" in the extract.
There were 15 occurrences of "t" in the extract.
There were 17 occurrences of "i" in the extract.
There were 20 occurrences of "e" in the extract.
There were 22 occurrences of " " in the extract.
Array ( [S] => 1 [U] => 1 [h] => 1 [L] => 1 [D] => 1 [,] => 1 [q] => 1 [b] => 1 [g] => 3 [a] => 4 [.] => 4 [d] => 4 [p] => 5 [c] => 6 [l] => 6 [m] => 7 [n] => 8 [r] => 8 [u] => 9 [o] => 9 [s] => 10 [t] => 15 [i] => 17 [e] => 20 [ ] => 22 )

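I have been using a mix of array_slice(), array_splice() and array_unshift(), but I am having trouble with the recursion.

Ideally, the leaves and branches would be represented by array indices 0 and 1.

Any ideas on how to build a Huffman tree as a multidimensional array would be greatly appreciated.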

Here is a complete solution to the problem in PHP:

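function huffmannEncode($string) {
    // Nothing to encode; this also avoids reading $occurences[0] on an empty list below
    if ($string === '') {
        return '';
    }
    $originalString = $string;
    $occurences = array();

    // Count each distinct character once, then strip it from the working copy
    while (isset($string[0])) {
        $occurences[] = array(substr_count($string, $string[0]), $string[0]);
        $string = str_replace($string[0], '', $string);
    }

    // Build the tree: repeatedly merge the two least frequent entries
    // into a new [combined count, [left, right]] entry
    sort($occurences);
    while (count($occurences) > 1) {
        $row1 = array_shift($occurences);
        $row2 = array_shift($occurences);
        $occurences[] = array($row1[0] + $row2[0], array($row1, $row2));
        sort($occurences);
    }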

    // $dictionary is filled with character => code pairs by the recursive function below
    $dictionary = [];
    fillDictionary($dictionary, is_array($occurences[0][1]) ? $occurences[0][1] : $occurences);

    // Generate the final encoded message
    $encoded = '';
    for($i = 0; $i < strlen($originalString); $i++) {
        $encoded .= $dictionary[$originalString[$i]];
    }
    return $encoded;
}

// This function walks the Huffman tree recursively and records the code for each character
function fillDictionary(&$dictionary, $data, $value = '') {
    if (!is_array($data[0][1])) {
        $dictionary[$data[0][1]] = $value.'0';
    } else {
        fillDictionary($dictionary, $data[0][1], $value.'0');
    }
    if (isset($data[1])) {
        if (!is_array($data[1][1])) {
            $dictionary[$data[1][1]] = $value.'1';
        } else {
            fillDictionary($dictionary, $data[1][1], $value.'1');
        }
    }
}

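The function counts the occurrences of each character, then uses a recursive helper function to build a dictionary that assigns a binary code to each character.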
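The question asks for the tree itself as a nested array with branches at indices 0 and 1, which huffmannEncode() only builds internally as [count, node] pairs. A minimal sketch of that form follows, assuming PHP 7 or later for the <=> operator. The names buildHuffmanTree() and huffmanCodes() are hypothetical and not part of the solution above, and ties between equal counts are broken arbitrarily, so the resulting codes may differ from those huffmannEncode() produces.

```php
<?php
// Hypothetical helper (not from the solution above): builds a Huffman tree
// as nested arrays. Index 0 is the left branch, index 1 the right branch.
// A leaf is the character itself (a string); a branch is a two-element array.
function buildHuffmanTree(array $frequencies)
{
    // Start with one [weight, node] pair per character.
    $queue = array();
    foreach ($frequencies as $character => $weight) {
        $queue[] = array($weight, (string) $character);
    }

    $byWeight = function ($a, $b) {
        return $a[0] <=> $b[0];
    };

    // Repeatedly merge the two lightest nodes into a new branch.
    while (count($queue) > 1) {
        usort($queue, $byWeight);
        $left = array_shift($queue);
        $right = array_shift($queue);
        $queue[] = array($left[0] + $right[0], array($left[1], $right[1]));
    }

    // An empty frequency table has no tree.
    return $queue ? $queue[0][1] : null;
}

// Hypothetical helper: walks the nested array and collects the bit string
// that leads to each leaf (0 = left, 1 = right).
function huffmanCodes($tree, $prefix = '')
{
    if (!is_array($tree)) {
        // A tree that is a single leaf gets the one-bit code "0".
        return array($tree => $prefix === '' ? '0' : $prefix);
    }
    // "+" keeps the keys as they are (array_merge would renumber digit keys).
    return huffmanCodes($tree[0], $prefix . '0') + huffmanCodes($tree[1], $prefix . '1');
}

$tree = buildHuffmanTree(array('a' => 5, 'b' => 2, 'c' => 1));
// $tree is array(array('c', 'b'), 'a')
$codes = huffmanCodes($tree);
// $codes maps 'c' => '00', 'b' => '01', 'a' => '1'
```

The frequency table passed in has the same shape as $characterFreq in the question, so that array could be passed to buildHuffmanTree() directly.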

Here is an example of huffmannEncode() in use:

// Test the functionality:
echo huffmannEncode('hello world');
// Output: 00100010101101110011110010101111
// And the dictionary:
/* 
[e] => 000
[h] => 001
[r] => 010
[w] => 011
[l] => 10
[o] => 110
[ ] => 1110
[d] => 1111
*/
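To check that the output above can be read back, here is a rough decoding sketch. huffmanDecode() is a hypothetical helper, not part of the solution; it assumes the caller already has the character => code dictionary, which huffmannEncode() as written does not return, so the dictionary listed above is copied in by hand. It relies on Huffman codes being prefix-free: the first code that matches the bits read so far is the right one.

```php
<?php
// Hypothetical helper: decodes a string of '0'/'1' characters using a
// character => code dictionary like the one listed above.
function huffmanDecode($bits, array $dictionary)
{
    // Invert the dictionary so each code maps back to its character.
    $characters = array();
    foreach ($dictionary as $character => $code) {
        $characters[$code] = (string) $character;
    }

    $decoded = '';
    $buffer = '';
    for ($i = 0, $length = strlen($bits); $i < $length; $i++) {
        // Collect bits until they form a complete code.
        $buffer .= $bits[$i];
        if (isset($characters[$buffer])) {
            $decoded .= $characters[$buffer];
            $buffer = '';
        }
    }

    // Leftover bits mean the input does not match the dictionary.
    if ($buffer !== '') {
        throw new InvalidArgumentException('Trailing bits do not form a known code');
    }
    return $decoded;
}

// Dictionary copied from the example above.
$dictionary = array(
    'e' => '000', 'h' => '001', 'r' => '010', 'w' => '011',
    'l' => '10', 'o' => '110', ' ' => '1110', 'd' => '1111',
);
echo huffmanDecode('00100010101101110011110010101111', $dictionary);
// Output: hello world
```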

