简体   繁体   中英

huffman decoding from a given file

I am given a string of characters that I have to encode and decode using the ideas of the huffman tree. The original string of characters is uploaded as a file, and the return for the encode method is a new file containing the translation table and the encoded message, in order.

Here is an example of what an output file would contain.

{0=0100, 1=0111, 2=11, 3=00, 4=10, 5=0110, =0101}

0110101100111101110100011100111000100011001011100101

My encode method:

 Map<Character, Integer> freq = new HashMap<>(); //store char, freq in map
        for (char c: message.toCharArray()) { //thank you enhanced for loop
            freq.put(c, freq.getOrDefault(c, 0) + 1);
        }

        //queue stores nodes
        PriorityQueue<Node> pq;
        pq = new PriorityQueue<>(Comparator.comparingInt(l -> l.freq)); 
        //nodes will be sorted by comparator

        //using hashmap we add the nodes to the queue
        for (var entry : freq.entrySet()) {
            pq.add(new Node(entry.getKey(), entry.getValue()));
        }

        while (pq.size() != 1) //must be more than 1 node
        {
            //removes the two nodes of lowest freq
            Node left = pq.poll();
            Node right = pq.poll();

            //connect new nodes
            int sum = left.freq + right.freq; //summary of freq, for parent
            pq.add(new Node('\0', sum, left, right)); //add back
        }

        //root of tree
        Node root = pq.peek();

        //new map for huffman codes
        Map<Character, String> huffmanCode = new HashMap<>();
        coding(root, "", huffmanCode);
        

        //testing
        //print the Huffman codes so we can know them (move this later)
        //System.out.println("Huffman code is " + huffmanCode);

        //testing
        //print encoded message
        StringBuilder sb = new StringBuilder();
        for (char c: message.toCharArray()) {
            sb.append(huffmanCode.get(c));
        }
        //System.out.println("Encoded string is : " + sb); //check file format

         Pair<Map,StringBuilder> p = new Pair<Map,StringBuilder>(huffmanCode,sb);
        
        return p;

I convert the pair into a string in the main method.

In order to decode this string, I am supposed to re-upload the file and use the information it contains to create a new file with the original uncoded message. I would have no problem decoding the message if it weren't for the uploading requirement. I lose access to the huffman tree I previously encoded and I have been unable to preserve it in the output files as I am required to only return strings.

I have tried splitting strings and working backwards from the table, but without the tree I am really struggling to decode the string without the huffman tree. Is there any way to preserve the tree in the output file, or is there a way to reconstruct the tree using the translation table? Thanks.

You would need to represent the tree as a sequence of bytes in order to write it to a file. That can certainly be done, one example being giving each node a identification number, and representing each node as two such numbers and a symbol.

However you don't need the tree if you construct your Huffman code canonically . The only information you need to send in order to decode canonical Huffman codes is the code length (in bits) for each symbol. That is what is normally done.

The list of code lengths can itself be compressed in many ways to reduce the overhead of the code description.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM