
Huffman Coding Algorithm/Data Structures

We were given the task of writing a compression algorithm for a .txt file of text and numbers (presumably using Huffman coding, since our professor was very vague).

I have all the lines as keys in a map, with their frequencies as the values. I'm unsure how to proceed from here, since a map is ordered by key, not by value. Should I be using a different data structure (not a map), or would it be easy enough to just find the two smallest values every time I want to add to the tree? Code below; any help would be awesome!

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <algorithm>
#include <vector>
#include <map>
using namespace std;

int main()
{
    vector <string> words;
    map <string, int> store;
    ifstream infile("file.txt");
    string text;
    while (getline(infile, text))
    {
        istringstream iss(text);
        string input;
        if (!(iss >> input))
            continue; //blank line: skip it instead of stopping the read
        words.push_back(input);
    }
    int freq = 0;


    while (!words.empty())
    {

        string check = words[0];
        if(check == "") //make sure not reading a blank
        {
            words.erase(remove(words.begin(), words.end(), ""), words.end()); //remove all blanks (erase needs the end iterator too)
            continue; //top of loop
        }
        check = words[0];
        freq = count(words.begin(), words.end(), check);//calculate frequency
        store.insert(pair<string, int>(check, freq)); //store words and frequency in map
        words.erase(remove(words.begin(), words.end(), check), words.end()); //erase that value entirely from the vector
    }

    map<string, int>::iterator i;

    for(i = store.begin(); i != store.end(); ++i)
    {
        cout << "store[" << i ->first << "] = " << i->second << '\n';
    } 
    return 0;
}

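To find the minimum value, you can use a priority queue.

A priority queue is a data structure that gives you the min or max value from a set of elements. Reading the top element costs O(1), while inserting an element or removing the top one costs O(log(n)). Huffman coding repeatedly removes the two smallest frequencies and inserts their sum, so in this case it might be a perfect choice.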
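To connect this to the question, here is a rough sketch of how the "take the two smallest, merge, put the sum back" loop might look with a min-ordered `std::priority_queue`. The `Node` struct, the `buildCodes` function, and the index-based tree layout are my own assumptions for illustration, not something from the question's code; the only thing taken from the question is the `map<string, int>` of frequencies.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node, stored by index in a vector (assumed layout).
struct Node {
    int freq;
    int left;            // index of the left child, or -1 for a leaf
    int right;           // index of the right child, or -1 for a leaf
    std::string symbol;  // only meaningful for leaves
};

// Builds a Huffman tree from the frequency map and returns one bit string per symbol.
std::map<std::string, std::string> buildCodes(const std::map<std::string, int>& store)
{
    std::vector<Node> nodes;
    using Entry = std::pair<int, int>;  // (frequency, node index)
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;

    // Every symbol starts as a leaf.
    for (const auto& kv : store) {
        nodes.push_back({kv.second, -1, -1, kv.first});
        pq.push({kv.second, static_cast<int>(nodes.size()) - 1});
    }

    // Repeatedly merge the two lowest-frequency nodes into a new parent.
    while (pq.size() > 1) {
        Entry a = pq.top(); pq.pop();
        Entry b = pq.top(); pq.pop();
        nodes.push_back({a.first + b.first, a.second, b.second, ""});
        pq.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }

    std::map<std::string, std::string> codes;
    if (pq.empty())
        return codes;  // empty input: nothing to encode

    // Walk the tree from the root, appending '0' for left and '1' for right.
    std::vector<std::pair<int, std::string>> pending;
    pending.emplace_back(pq.top().second, "");
    while (!pending.empty()) {
        std::pair<int, std::string> cur = pending.back();
        pending.pop_back();
        const Node& n = nodes[cur.first];
        if (n.left == -1) {
            // A lone symbol would otherwise get an empty code, so give it "0".
            codes[n.symbol] = cur.second.empty() ? "0" : cur.second;
        } else {
            pending.emplace_back(n.left, cur.second + "0");
            pending.emplace_back(n.right, cur.second + "1");
        }
    }
    return codes;
}
```

With this approach the map stays as it is and only serves as the input. You would call `buildCodes(store)` once the frequencies are counted, and more frequent symbols should come back with shorter bit strings.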

C++ has its own built-in Priority Queue.

Here is a simple example of std::priority_queue in C++:

#include <iostream>
#include <queue>
using namespace std;

int main()
{
    priority_queue <int> Q;

    Q.push(10);
    Q.push(7);
    Q.push(1);
    Q.push(-3);
    Q.push(4);

    while(!Q.empty()){ // run this loop until the priority_queue gets empty

        int top = Q.top();
        Q.pop();
        cout << top << ' ';

    }
    cout << endl;
    return 0;
}

Output:

10 7 4 1 -3

As you can see, this is in descending order. That is because:

By default, a priority queue gives the highest value first.

So you can either give the priority queue a different comparator (for example std::greater<int>, so that the smallest value comes out first), or use a neat trick: store the values with their signs inverted, and invert the sign again after you pop them from the queue.
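Both options can be sketched side by side as below. The function names `drainAscending` and `drainAscendingNegated` are made up for this illustration; `std::greater` and the three-parameter form of `std::priority_queue` are standard library features.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Option 1: a min-heap via the comparator template parameter.
std::vector<int> drainAscending(const std::vector<int>& values)
{
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    for (int v : values)
        minHeap.push(v);

    std::vector<int> out;
    while (!minHeap.empty()) {
        out.push_back(minHeap.top());  // smallest remaining value
        minHeap.pop();
    }
    return out;
}

// Option 2: the sign trick on the default max-heap.
// Caveat: negating the most negative int overflows, so this assumes
// the input never contains that value.
std::vector<int> drainAscendingNegated(const std::vector<int>& values)
{
    std::priority_queue<int> maxHeap;
    for (int v : values)
        maxHeap.push(-v);  // store with inverted sign

    std::vector<int> out;
    while (!maxHeap.empty()) {
        out.push_back(-maxHeap.top());  // invert the sign back
        maxHeap.pop();
    }
    return out;
}
```

For the same five numbers as in the example above (10, 7, 1, -3, 4), both functions should return them smallest first. The comparator version is probably the safer default, since it also works for types that cannot be negated, such as a pair of frequency and node.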


 