
How to find repeated words in a file with a vector in C++

My task: a file contains an unknown number of words, and some of them are repeated several times (how many times is also unknown). I have to find those repeated words. I use classes and a vector to work with the words, and fstream to work with the files. But I cannot find a resource or an algorithm for finding repeated words, and I'm puzzled. I have a vector of strings and I push the words into it. That part works; I tested it by printing v.size(). I have done everything except the algorithm for finding the repeated words, which turned out to be difficult for me.

The full code I have written so far:

#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <algorithm>
#include <stdio.h>
#include <iterator>
using namespace std;
class Wording {
private:
    string word;
    vector <string> v;
public:

    Wording(string Alternateword, vector <string> Alternatev) {
        v = Alternatev;
        word = Alternateword;
    }
};
int main() {
    ifstream ifs("words.txt");
    ofstream ofs("wordresults.txt");
    string word;
    vector <string> v;
    Wording obj(word,v);
    while(ifs >> word) v.push_back(word);
    for(size_t i=0; i<v.size(); i++) {

        //waiting for algorithm
        //ofs << v[i] << endl;
    }
    return 0;
}

Try using a hash map. With older GNU C++ this is the non-standard __gnu_cxx::hash_map extension (header <ext/hash_map>). In C++11 you can use std::unordered_map, which gives you the same capabilities. Otherwise, Boost provides boost::unordered_map, and equivalents are probably available elsewhere.

The key concept here is a hash map from each word to its count, e.g. std::unordered_map<std::string, int>.
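
A minimal sketch of the word-to-count idea, assuming C++11 and that the words have already been read into a vector as in the question. The helper names count_words and repeated_words are made up for illustration and are not part of any library.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs in the vector.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];  // operator[] inserts a missing key with value 0 first
    }
    return counts;
}

// Hypothetical helper: returns the words that occur more than once,
// sorted alphabetically so the output order is predictable.
std::vector<std::string> repeated_words(const std::unordered_map<std::string, int>& counts) {
    std::vector<std::string> result;
    for (const auto& entry : counts) {
        if (entry.second > 1) {
            result.push_back(entry.first);
        }
    }
    std::sort(result.begin(), result.end());
    return result;
}
```

In the question's main(), this could be called as repeated_words(count_words(v)) after the reading loop, and each returned word written to ofs.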

Are the unique words in the input file what you want? If so, you can do this with a set (unordered_set if you don't need them sorted), like so:

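std::set<std::string> words; // can be changed to std::unordered_set
// read whitespace-separated words from ifs straight into the set (needs <iterator>)
std::copy(std::istream_iterator<std::string>(ifs), std::istream_iterator<std::string>(), std::inserter(words, words.begin()));
// write each distinct word on its own line
std::copy(words.begin(), words.end(), std::ostream_iterator<std::string>(ofs, "\n"));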

You can also use a vector, but you'll have to sort it and then apply std::unique to it (erasing the leftover tail afterwards).

I can't compile this code now, so there might be some errors in my code snippet.
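
The vector alternative mentioned above (sort, then unique) could look roughly like this. It is only a sketch; unique_words is a made-up helper name, and it takes the vector by value so the caller's copy keeps its original order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the distinct words in sorted order.
std::vector<std::string> unique_words(std::vector<std::string> words) {
    std::sort(words.begin(), words.end());
    // std::unique only moves duplicates to the end; erase drops that tail.
    words.erase(std::unique(words.begin(), words.end()), words.end());
    return words;
}
```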

If what you want is the number of occurrences of each distinct word in the file, then you'll have to use some kind of map, as was already suggested. Of course, putting the words in a vector, sorting it, and then counting consecutive equal words is also a solution, but it wouldn't be as clear.
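
For comparison, the sort-and-count-consecutive approach might be sketched as follows. The helper name count_runs is an assumption for illustration; it relies on equal words being adjacent once the vector is sorted.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (word, count) pairs in sorted word order.
std::vector<std::pair<std::string, std::size_t>> count_runs(std::vector<std::string> words) {
    std::sort(words.begin(), words.end());
    std::vector<std::pair<std::string, std::size_t>> counts;
    for (auto it = words.begin(); it != words.end(); ) {
        // In a sorted range, upper_bound finds the end of the run equal to *it.
        auto run_end = std::upper_bound(it, words.end(), *it);
        counts.emplace_back(*it, static_cast<std::size_t>(run_end - it));
        it = run_end;
    }
    return counts;
}
```

A word is repeated exactly when its count is greater than 1, so filtering the returned pairs on that condition yields the repeated words.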


 