Counting occurrences of the same string/word in a text file in C++

I'm trying to count how many times the same string/word occurs in a text file in C++.

This is my text file
one two three two
test testing 123
1 2 3

This is my main program

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main(int argc, const char** argv)
{
    int counter = 0;
    int ncounter = 0;
    string str;
    ifstream input(argv[1]);

    while (getline(input, str)) 
    {
        if(str.find("two") != string::npos){counter++;}
        if(str.find('\n') != string::npos){ncounter++;}

        cout << str << endl; //To show the content of the file
    }

    cout << endl;
    cout << "String Counter: " << counter << endl;
    cout << "'\\n' Counter: " << ncounter << endl;

    return 0;
}

I'm using the .find() function to find the string. When I search for a nonexistent word, it isn't counted, as expected. When I search for the word "two", it is counted, but only once.

Why didn't it count it 2 times?

As for the newline ('\n'), it doesn't count any. Why is that?

Because both occurrences of "two" are on the same line, and you check each line only once for the substring: find() tells you whether it occurs, not how many times.
You can't find the '\n' because the getline function reads the line up to the '\n' and discards it, so it never appears in str.

Why not use a std::multiset to store the words? It would do the counting for you, and reading the file into it can be done in a single statement:

#include <iostream>
#include <fstream>
#include <string>
#include <set>
#include <iterator>
#include <cstdlib>   // for system()

int main(int argc, const char** argv)
{
    // Open the file
    std::ifstream input(argv[1]);

    // Read all the words into a set
    std::multiset<std::string> wordsList = 
        std::multiset<std::string>( std::istream_iterator<std::string>(input),
                                    std::istream_iterator<std::string>());

    // Iterate over every word
    for(auto word = wordsList.begin(); word != wordsList.end(); word=wordsList.upper_bound(*word))
        std::cout << *word << ": " << wordsList.count(*word) << std::endl;

    // Done
    system("pause"); // Windows-only: keeps the console window open
    return 0;
}

Note the last part of the for statement: word = wordsList.upper_bound(*word). It jumps past all copies of the current word, which ensures each distinct value in the set is output only once. Technically you could replace it with word++ (in which case it would be better to shorten the loop to for (auto word : wordsList)), but then every word would be printed once per occurrence.

It also lists the words themselves, so you no longer need to print them inside your current while loop.

Your best bet is going to be to read each line, then tokenize along the white space so you can examine each word individually.

I suspect we're talking about a homework assignment here, so my best answer is to direct you to the C++ reference for std::strtok: http://en.cppreference.com/w/cpp/string/byte/strtok
