Huffman Decoding Function Uncompressing One Character Repeatedly

I have a program that builds a Huffman tree from the ASCII character frequencies read from a text input file. The Huffman codes are stored in a string array of 256 elements, with an empty string for any character that does not appear in the input. The program also writes an encoded, compressed output file.

I am now trying to decompress and decode that output file. It is opened as the input file, and a new output file should contain the decoded message, identical to the original text input file.

My plan for this part of the assignment is to rebuild the tree of Huffman codes and then, reading 8 bits at a time, traverse the tree until I reach a leaf node. At that point I append the leaf's character to an initially empty string (string answer) and write it to my output file.
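The traversal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the question: `Node`, `isLeaf`, `decodeBits`, and the `symbolCount` parameter are hypothetical names, bits are assumed to be stored most-significant first, and the symbol count is assumed to be known (for example, as the sum of the frequency table) so that padding bits in the last byte are not decoded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal node type; the question's own `node` class is not shown.
struct Node {
    unsigned char value;   // meaningful only for leaves
    const Node* child0;    // followed on a 0 bit
    const Node* child1;    // followed on a 1 bit
    bool isLeaf() const { return child0 == nullptr && child1 == nullptr; }
};

// Decode up to `symbolCount` symbols from `bytes`, reading each byte from its
// most significant bit down to its least significant bit.
std::string decodeBits(const Node* root,
                       const std::vector<unsigned char>& bytes,
                       std::size_t symbolCount) {
    assert(root != nullptr && !root->isLeaf());
    std::string out;
    const Node* cur = root;
    for (unsigned char byte : bytes) {
        for (int p = 7; p >= 0 && out.size() < symbolCount; --p) {
            int bit = (byte >> p) & 1;                 // integer 0 or 1
            cur = (bit == 0) ? cur->child0 : cur->child1;
            if (cur->isLeaf()) {
                out += static_cast<char>(cur->value);  // emit one decoded symbol
                cur = root;                            // restart at the root
            }
        }
    }
    return out;
}
```

The result is returned once, after all bytes have been consumed, so each decoded character appears exactly once in the output. Stopping at `symbolCount` keeps the zero padding in the final byte from being decoded as extra characters.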

My problem: after writing this function, I see that only one of the characters from my original input file gets output, repeatedly. I am confused as to why, because I expect the output file to be identical to the original input file.

Any guidance or solution to this problem is appreciated.

(For the encodeOutput function, fileName is the input file parameter and fileName2 is the output file parameter.)

(For the decodeOutput function, fileName2 is the input file parameter and fileName3 is the output file parameter.)

code[256] is a parameter of both functions and holds the Huffman code for each unique character read from the original input file. For example, if the character 'H' appears in the input file, code[72] might hold the code "111" at the time the array is passed to the functions.
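As an illustration of how such a code table maps to bytes on disk, here is a minimal sketch that packs the '0'/'1' code strings into bytes, most significant bit first. The name `packCodes` and the use of `std::array` and `std::vector` are assumptions for the example and are not taken from the program in question.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Pack the '0'/'1' code string of every byte in `text` into output bytes,
// filling each byte from its most significant bit downwards.
std::vector<unsigned char> packCodes(const std::string& text,
                                     const std::array<std::string, 256>& code) {
    std::vector<unsigned char> out;
    unsigned char buffer = 0;
    int bitCount = 0;
    for (unsigned char ch : text) {
        const std::string& bits = code[ch];
        assert(!bits.empty());  // every input character must have a code
        for (char b : bits) {
            buffer = static_cast<unsigned char>((buffer << 1) | (b != '0'));
            if (++bitCount == 8) {   // a full byte is ready
                out.push_back(buffer);
                buffer = 0;
                bitCount = 0;
            }
        }
    }
    if (bitCount != 0) {
        // Left-align the leftover bits and pad the rest of the byte with zeros.
        out.push_back(static_cast<unsigned char>(buffer << (8 - bitCount)));
    }
    return out;
}
```

The explicit cast on the final byte matters when writing to a stream: shifting a `char` promotes it to `int`, and streaming an `int` with `<<` writes its decimal text rather than a single byte, so a call such as `ofile.put(...)` on the cast value is the safer way to emit it.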

freq[256] holds the frequency of each ASCII character read, or 0 if the character does not appear in the original input file.

void encodeOutput(const string & fileName, const string & fileName2, string code[256]) {
    ifstream ifile;
    ifile.open(fileName, ios::binary);
    if (!ifile)
    {
        die("Can't read again");
    }
    ofstream ofile;
    ofile.open(fileName2, ios::binary);
    if (!ofile) {
        die("Can't open encoding output file");
    }
    int read;
    read = ifile.get();
    char buffer = 0, bit_count = 0;
    while (read != -1) {
        for (unsigned b = 0; b < code[read].size(); b++) { // loop through bits (code[read] outputs huffman code)
            buffer <<= 1;
            buffer |= code[read][b] != '0';
            bit_count++;
            if (bit_count == 8) {
                ofile << buffer;
                buffer = 0;
                bit_count = 0;
            }
        }
        read = ifile.get();
    }

    if (bit_count != 0)
        ofile << (buffer << (8 - bit_count));

    ifile.close();
    ofile.close();
}
// Work in progress
void decodeOutput(const string & fileName2, const string & fileName3, string code[256], const unsigned long long freq[256]) {
    ifstream ifile;
    ifile.open(fileName2, ios::binary);
    if (!ifile)
    {
        die("Can't read again");
    }
    ofstream ofile;
    ofile.open(fileName3, ios::binary);
    if (!ofile) {
        die("Can't open encoding output file");
    }
    priority_queue < node > q;
    for (unsigned i = 0; i < 256; i++) {
        if (freq[i] == 0) {
            code[i] = "";
        }
    }

    for (unsigned i = 0; i < 256; i++)
        if (freq[i])
            q.push(node(unsigned(i), freq[i]));

    if (q.size() < 1) {
        die("no data");
    }

    while (q.size() > 1) {
        node *child0 = new node(q.top());
        q.pop();
        node *child1 = new node(q.top());
        q.pop();
        q.push(node(child0, child1));
    } // created the tree
    string answer = "";
    const node * temp = &q.top(); // root 
    for (int c; (c = ifile.get()) != EOF;) {
        for (unsigned p = 8; p--;) { //reading 8 bits at a time 
            if ((c >> p & 1) == '0') { // if bit is a 0
                temp = temp->child0; // go left
            }
            else { // if bit is a 1
                temp = temp->child1; // go right
            }
            if (temp->child0 == NULL && temp->child1 == NULL) // leaf node
            {
                answer += temp->value;
                temp = &q.top();
            }
            ofile << answer;
        }
    }
}

(c >> p & 1) == '0'

This comparison is only true when (c >> p & 1) equals 48, the ASCII value of the character '0'. Since (c >> p & 1) can only be 0 or 1, the test is never true and your if statement always takes the else branch. The correct code is:

(c >> p & 1) == 0
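
To make the difference concrete, here is a small sketch. `bitAt` is a hypothetical helper name introduced only for this example, and the `static_assert` documents the assumption of an ASCII-compatible character set.

```cpp
#include <cassert>

// The character literal '0' is the integer 48 on ASCII-compatible systems.
static_assert('0' == 48, "assumes an ASCII-compatible execution character set");

// Return bit p of byte c (p = 0 is the least significant bit) as the
// integer 0 or 1, never as the character '0' or '1'.
inline int bitAt(int c, unsigned p) {
    assert(p < 8);
    return (c >> p) & 1;
}
```

Because `bitAt` can only return 0 or 1, comparing its result with `'0'` (48) is always false, while comparing it with `0` selects the left child as intended. Separately, in the code as posted the statement that writes the accumulated string to `ofile` sits inside the inner bit loop, so the whole string is written again after every bit; writing each character once as it is decoded, or writing the string once after both loops, would likely be needed as well.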
