
Store Huffman tree in Java

I am working on a file compression and decompression project using Huffman coding. First I compute the frequency of every unique character in the file I want to compress. Then I build the tree from those frequencies using a priority queue.

public static HuffmanNode buildTree(Map<Character, Integer> freq) {

    PriorityQueue<HuffmanNode> priorityQueue = new PriorityQueue<>();
    Set<Character> keySet = freq.keySet();
    for (Character c : keySet) {

        HuffmanNode huffmanNode = new HuffmanNode();
        huffmanNode.data = c;
        huffmanNode.frequency = freq.get(c);
        huffmanNode.left = null;
        huffmanNode.right = null;
        priorityQueue.offer(huffmanNode);
    }
    assert priorityQueue.size() > 0;

    while (priorityQueue.size() > 1) {

        // poll() returns and removes the lowest-frequency node in one step
        HuffmanNode x = priorityQueue.poll();

        HuffmanNode y = priorityQueue.poll();

        HuffmanNode sum = new HuffmanNode();

        sum.frequency = x.frequency + y.frequency;
        sum.data = '-';

        sum.left = x;

        sum.right = y;
        root = sum;

        priorityQueue.offer(sum); // Re-inserts the merged node; PriorityQueue is unbounded, so offer never fails for lack of capacity.
    }

    return priorityQueue.poll();
}

Then I traverse the tree to get the bit string for each character, where a left child contributes 0 and a right child contributes 1, and I write those bits to a file. That is the compression part. My problem is that I cannot decompress the compressed file. I think I have to store the tree by serializing it (a Java concept I have heard of) and then traverse the tree when decompressing. But I don't know how to serialize or store the tree. Can anyone help me solve this?

Yes, you need to send a representation of the code before you send the encoded symbols, in order for the decoder to be able to make sense of it.

There are many ways. The most straightforward is to traverse the tree, sending a 0 bit for each branch (internal node) encountered, or a 1 bit for each leaf, followed immediately by, in your case, the eight bits of the symbol for that leaf. On the decoder end you can read those bits and reconstruct the tree. The description is self-terminating: every branch has exactly two children, so the decoder knows the tree is complete as soon as the last pending subtree ends in a leaf. You therefore follow the description immediately with the encoded symbols.
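The scheme described above could be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not the asker's actual code. `HuffmanTreeIO`, `Node`, `BitWriter`, `BitReader`, `toBytes` and `fromBytes` are hypothetical names introduced for illustration, and `Node` is a stand-in for the question's `HuffmanNode`. It also assumes every symbol fits in eight bits, so a Java `char` above 255 would be truncated.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class; all names here are assumptions for illustration.
final class HuffmanTreeIO {

    // Stand-in for the question's HuffmanNode (only the fields needed here).
    static final class Node {
        char data;
        Node left, right;
        boolean isLeaf() { return left == null && right == null; }
    }

    static Node leaf(char c) { Node n = new Node(); n.data = c; return n; }

    static Node internal(Node l, Node r) { Node n = new Node(); n.left = l; n.right = r; return n; }

    // Writes individual bits, most significant bit of each byte first.
    static final class BitWriter {
        private final OutputStream out;
        private int buffer, count;
        BitWriter(OutputStream out) { this.out = out; }
        void writeBit(int bit) throws IOException {
            buffer = (buffer << 1) | (bit & 1);
            if (++count == 8) { out.write(buffer); buffer = 0; count = 0; }
        }
        void writeBits(int value, int n) throws IOException {
            for (int i = n - 1; i >= 0; i--) writeBit(value >>> i);
        }
        // Pads the final partial byte with zero bits.
        void flush() throws IOException {
            while (count != 0) writeBit(0);
            out.flush();
        }
    }

    // Reads individual bits in the same order BitWriter wrote them.
    static final class BitReader {
        private final InputStream in;
        private int buffer, count;
        BitReader(InputStream in) { this.in = in; }
        int readBit() throws IOException {
            if (count == 0) {
                buffer = in.read();
                if (buffer < 0) throw new EOFException("ran out of bits");
                count = 8;
            }
            count--;
            return (buffer >>> count) & 1;
        }
        int readBits(int n) throws IOException {
            int value = 0;
            for (int i = 0; i < n; i++) value = (value << 1) | readBit();
            return value;
        }
    }

    // Pre-order traversal: 0 for a branch, 1 plus eight symbol bits for a leaf.
    static void writeTree(Node node, BitWriter w) throws IOException {
        if (node.isLeaf()) {
            w.writeBit(1);
            w.writeBits(node.data, 8); // assumption: symbol fits in 8 bits
        } else {
            w.writeBit(0);
            writeTree(node.left, w);
            writeTree(node.right, w);
        }
    }

    // Mirror of writeTree: the recursion ends by itself once the tree is complete.
    static Node readTree(BitReader r) throws IOException {
        if (r.readBit() == 1) return leaf((char) r.readBits(8));
        Node left = readTree(r);
        Node right = readTree(r);
        return internal(left, right);
    }

    // Convenience wrappers for trying the round trip in memory.
    static byte[] toBytes(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            BitWriter w = new BitWriter(bytes);
            writeTree(root, w);
            w.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Node fromBytes(byte[] data) {
        try {
            return readTree(new BitReader(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real compressor you would probably call `writeTree` on the same `BitWriter` that then writes the code bits, and call `flush` only once at the very end. Because the last byte is padded, the decoder also needs some way to know where the symbols stop, for example a symbol count stored in the file; that part is left out of this sketch.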
