Problems when compressing and decompressing a small .png file using Huffman Coding (Java)

So I have a Java class that implements Huffman Coding and I want to use it to compress and decompress any type of file.

Here is my code:

import java.io.*;
import java.util.*;

public class HuffmanCoding {

    public static void main(String[] args) throws IOException {

        String inputFilePath = "C:\\Users\\MAJ\\eclipse-workspace\\ProjectTwo\\src\\inputFile.png";
        String encodedOutputFilePath = "C:\\Users\\MAJ\\eclipse-workspace\\ProjectTwo\\src\\encodedOutputFile.txt";
        // get the frequencies of all the bytes in the file
        byte[] data = fileToByteArray(inputFilePath);
        Map<Byte, Integer> frequencyTable = getByteFrequencies(data);

        // create a Huffman coding tree
        Node root = createHuffmanTree(frequencyTable);

        // create the table of encodings for each byte
        Map<Byte, String> encodings = createEncodings(root);

        // encode the input file and write the encoded output to the output file
        encodeFile(data, encodings, encodedOutputFilePath);
        String inputFileExtension = inputFilePath.substring(inputFilePath.lastIndexOf('.'));
        String decompressedOutputFilePath = "C:\\Users\\MAJ\\eclipse-workspace\\ProjectTwo\\src\\decompressedOutputFile" + inputFileExtension;
        decodeFile(encodedOutputFilePath, decompressedOutputFilePath, root);
    }

    public static byte[] fileToByteArray(String filePath) throws IOException {
        // read the file
        BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(filePath));
        byte[] data = inputStream.readAllBytes();
        inputStream.close();

        return data;
    }


    public static Map<Byte, Integer> getByteFrequencies(byte[] data) {
        // map for storing the frequencies of the bytes
        Map<Byte, Integer> frequencyTable = new HashMap<>();

        // count the frequencies of the bytes
        for (byte b : data) {
            frequencyTable.put(b, frequencyTable.getOrDefault(b, 0) + 1);
        }

        return frequencyTable;
    }

    public static Node createHuffmanTree(Map<Byte, Integer> frequencyTable) {
        // create a priority queue to store the nodes of the tree
        PriorityQueue<Node> queue = new PriorityQueue<>(Comparator.comparingInt(n -> n.frequency));

        // create a leaf node for each byte and add it to the priority queue
        for (Map.Entry<Byte, Integer> entry : frequencyTable.entrySet()) {
            queue.add(new Node(entry.getKey(), entry.getValue()));
        }

        // create the Huffman tree
        while (queue.size() > 1) {
            // remove the two nodes with the lowest frequency from the queue
            Node left = queue.poll();
            Node right = queue.poll();

            // create a new internal node with these two nodes as children and the sum of their frequencies as the frequency
            assert right != null;
            Node parent = new Node(left.frequency + right.frequency, left, right);

            // add the new internal node to the queue
            queue.add(parent);
        }

        // the root node is the node remaining in the queue
        return queue.poll();

    }


    // node class for the Huffman tree
    static class Node {
        int frequency;
        byte character;
        Node left;
        Node right;

        Node(int frequency, Node left, Node right) {
            this.frequency = frequency;
            this.left = left;
            this.right = right;
        }

        Node(byte character, int frequency) {
            this.character = character;
            this.frequency = frequency;
        }
    }

    public static Map<Byte, String> createEncodings(Node root) {
        // map for storing the encodings of the bytes
        Map<Byte, String> encodings = new HashMap<>();

        // create the encodings
        createEncodings(root, "", encodings);

        return encodings;
    }

    private static void createEncodings(Node node, String encoding, Map<Byte, String> encodings) {
        if (node == null) {
            return;
        }
        if (node.character != 0) {
            // this is a leaf node, so add the encoding to the map
            encodings.put(node.character, encoding);
        } else {
            // this is an internal node, so recurse on the left and right children
            createEncodings(node.left, encoding + "0", encodings);
            createEncodings(node.right, encoding + "1", encodings);
        }
    }



    public static void encodeFile(byte[] data, Map<Byte, String> encodings, String outputFilePath) throws IOException {
        BufferedWriter writer = new BufferedWriter(new FileWriter(outputFilePath));

        // create a string builder for building the encoded string
        StringBuilder sb = new StringBuilder();

        // encode the data and add the encoded string to the string builder
        for (byte b : data) {
            String str = encodings.get(b);
            if (str == null) {
                str = "";
            }
            sb.append(str);
        }

        // write the encoded string to the output file
        writer.write(sb.toString());

        writer.close();
    }




    public static void decodeFile(String inputFilePath, String outputFilePath, Node root) throws IOException {
        // read the encoded data from the input file
        BufferedReader reader = new BufferedReader(new FileReader(inputFilePath));
        String encodedData = reader.readLine();
        reader.close();

        // create the output file
        BufferedOutputStream outputStream = new BufferedOutputStream(new FileOutputStream(outputFilePath));

        // decode the data and write it to the output file
        Node current = root;
        for (int i = 0; i < encodedData.length(); i++) {
            current = encodedData.charAt(i) == '0' ? current.left : current.right;
            assert current != null;
            if (current.left == null && current.right == null) {
                outputStream.write(current.character);
                current = root;
            }
        }
        outputStream.close();
    }




}

When compressing and decompressing a .txt file, everything works fine. But when I compress and decompress a small .png image (5 KB), the decompressed file, which should be identical to the original .png image, has the correct size yet fails to load in any image viewer. I can't figure out what the problem is, and I'm assuming the same problem will occur with other kinds of files (.mp4, .mp3, .jpeg, .exe, etc.). Please help me out if you can!

You can't have a "special" character if you want to be able to code all possible bytes. Also, you don't need one. Leaves are already identified by null pointers. If you change:

if (node.character != 0) {

to:

if (node.left == null) {

then it works.
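To see why the change matters: in the question's code, an internal node never sets character, so it keeps the default value 0. A leaf that holds the byte 0 therefore looks like an internal node to the original check, never receives a code, and encodeFile writes an empty string for it. Text files rarely contain zero bytes, while binary formats such as PNG usually do. The following is a minimal, self-contained sketch of the leaf test; the class name LeafCheckSketch, the simplified Node, and the hand-built tree are illustrative assumptions, not the asker's exact code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a leaf is recognized by its missing children, not by its byte value.
class LeafCheckSketch {

    static class Node {
        final byte value;       // only meaningful for leaves
        final Node left;
        final Node right;

        Node(byte value) {      // leaf
            this.value = value;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {   // internal node
            this.value = 0;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    // Walks the tree and records the bit string that leads to each leaf.
    static void collect(Node node, String prefix, Map<Byte, String> codes) {
        if (node == null) {
            return;
        }
        if (node.isLeaf()) {
            codes.put(node.value, prefix);  // byte 0 gets a code like any other byte
            return;
        }
        collect(node.left, prefix + "0", codes);
        collect(node.right, prefix + "1", codes);
    }

    static Map<Byte, String> codesFor(Node root) {
        Map<Byte, String> codes = new HashMap<>();
        collect(root, "", codes);
        return codes;
    }

    public static void main(String[] args) {
        // Hand-built tree: byte 0 on the left, bytes 65 and -119 under the right child.
        Node root = new Node(new Node((byte) 0),
                             new Node(new Node((byte) 65), new Node((byte) -119)));
        System.out.println(codesFor(root));
    }
}
```

With the original test (node.character != 0), the same tree would produce codes for only two of the three bytes.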

You still have a ways to go before you have a working Huffman coder and decoder. You need to write bits instead of bytes, so that you're not dramatically expanding your data instead of compressing it. Having done that, you'll need to deal with the extra bits in the last byte, to make sure the decoder doesn't decode an extraneous symbol or two at the end. To do that, you'll need to either send the number of symbols ahead of the symbols, or encode an additional end-of-stream symbol. You need to represent and encode the Huffman code at the start of the compressed data, so that the decoder can interpret the codes. You need to demonstrate that your encoder and decoder work by making them separate programs, so that the only thing the decoder has to go on is the one compressed file.
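The first two points (writing bits rather than characters, and handling the padding in the last byte) can be sketched as follows. This is one possible approach under stated assumptions: the class BitPackingSketch and its pack and unpack methods are hypothetical names, the symbol count is stored as a 4-byte big-endian header, and the code table is passed in directly instead of being serialized into the output, so storing the Huffman code itself is not covered here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: pack '0'/'1' code strings into real bits, with a symbol-count header.
class BitPackingSketch {

    // Assumes every byte in data has an entry in codes.
    static byte[] pack(byte[] data, Map<Byte, String> codes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Header: number of symbols, so the decoder knows where to stop.
        out.write(ByteBuffer.allocate(4).putInt(data.length).array(), 0, 4);

        int buffer = 0;     // bits collected so far
        int bitCount = 0;   // how many bits of buffer are in use
        for (byte b : data) {
            for (char c : codes.get(b).toCharArray()) {
                buffer = (buffer << 1) | (c == '1' ? 1 : 0);
                if (++bitCount == 8) {
                    out.write(buffer);
                    buffer = 0;
                    bitCount = 0;
                }
            }
        }
        if (bitCount > 0) {
            out.write(buffer << (8 - bitCount));    // pad the last byte with zero bits
        }
        return out.toByteArray();
    }

    static byte[] unpack(byte[] packed, Map<Byte, String> codes) {
        // Invert the table; a prefix code guarantees the first match is the right one.
        Map<String, Byte> symbols = new HashMap<>();
        for (Map.Entry<Byte, String> e : codes.entrySet()) {
            symbols.put(e.getValue(), e.getKey());
        }

        ByteBuffer in = ByteBuffer.wrap(packed);
        byte[] result = new byte[in.getInt()];
        int written = 0;
        StringBuilder current = new StringBuilder();
        while (written < result.length && in.hasRemaining()) {
            int b = in.get() & 0xFF;
            // Stop as soon as the announced number of symbols is reached,
            // so the padding bits are never decoded.
            for (int bit = 7; bit >= 0 && written < result.length; bit--) {
                current.append(((b >> bit) & 1) == 1 ? '1' : '0');
                Byte symbol = symbols.get(current.toString());
                if (symbol != null) {
                    result[written++] = symbol;
                    current.setLength(0);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Byte, String> codes = new HashMap<>();
        codes.put((byte) 0, "0");
        codes.put((byte) 1, "10");
        codes.put((byte) -1, "11");
        byte[] data = {0, 1, -1, 0, 0, 1};
        byte[] packed = pack(data, codes);
        System.out.println(packed.length + " bytes: " + Arrays.toString(unpack(packed, codes)));
    }
}
```

In this example the six symbols occupy nine bits, so the last byte carries seven padding bits. Without the count in the header, a decoder would read those zero bits as seven extra occurrences of the byte whose code is "0". An end-of-stream symbol, as the answer mentions, is the alternative to the count.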
