
I'm counting the number of characters in a file, but I want to count the words of 5 or fewer characters and the words of 6 or more

I want to do this: read the words in the file one at a time (using a string to do this) and count three things: how many single-character words are in the file, how many short (2 to 5 characters) words are in the file, and how many long (6 or more characters) words are in the file. This is where I need help.

I'm not sure how to read the file into a string. I know I have to do something like this, but I don't understand the rest.

ifstream infile;
//char mystring[6];
//char mystring[20];

int main()
{
    infile.open("file.txt");
    if(infile.fail())
    {
        cout << " Error " << endl;
    }

    int numb_char=0;
    char letter;

    while(!infile.eof())
    {
        infile.get(letter);
        cout << letter;
        numb_char++;
        break;
    }

    cout << " the number of characters is :" << numb_char << endl;
    infile.close();
    return 0;
}

I'm not quite sure where to start...

Your loop:

while(!infile.eof())
{
  infile.get(letter);
  cout << letter;
  numb_char++;
  break;
}

Would only execute once due to the extra break;

Also, this code looks like it is trying to count the number of characters in the file, not the number of words that are up to 5 letters long or 6 letters and longer.

Try something like:

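#include <cctype>
#include <fstream>
#include <iostream>
using namespace std;

ifstream infile;

int main(){
  infile.open("file.txt");
  if(!infile.good()){
    cout << " Error " << endl;
    return 1;
  }
  int shortCount = 0;  // 1-character words
  int mediumCount = 0; // 2 to 5 characters
  int longCount = 0;   // 6 or more characters
  int charCount = 0;   // length of the word being read
  char letter;
  bool more = true;
  while(more){
    // get() returns whitespace too; operator>> would skip it, so a
    // comparison with ' ' could never succeed.
    more = !infile.get(letter).fail();
    if(!more || isspace(static_cast<unsigned char>(letter))){ // end of word or file.
      if(charCount == 1)
        shortCount++;
      else if(charCount >= 2 && charCount < 6)
        mediumCount++;
      else if(charCount >= 6)
        longCount++;
      charCount = 0; // a count of 0 (repeated whitespace) is not a word
    }else{
      charCount++;
    }
  }
  cout << "Short Words: " << shortCount << endl;
  cout << "Medium Words: " << mediumCount << endl;
  cout << "Long Words: " << longCount << endl;
  infile.close();
  return 0;
}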

This could be a Unicode issue. Check the encoding of the file; if it is Unicode, you will need to use the appropriate stream class (wfstream) and character type (wchar_t). Unicode is becoming increasingly common, and I would not be surprised if that were the source of your problem.

As I mentioned, you're reading a single character and then breaking out of your loop. Don't break.

As for how to do this: one approach would be to define 3 counters, int fiveMinusLetterWord, int sixPlusLetterWord, and int singleLetterWord. Count characters until letter == ' '. When you hit a space, see how many characters you've read; that's the length of the previous word. Increment one of your counters if needed, reset your character counter, and proceed until the end of the file. Remember to check the length of the last word after the loop exits. You will also have to deal with end-of-line delimiters, since you're reading a single character at a time.

An easier approach, since this is C++, would be to use istream& getline(istream& is, string& str); from <string> to read line by line into a std::string, then use the std::string functions to find your words.

EDIT: I missed the part in your question that says "read in one word at a time". Look at the other answer, you can read a single word from a stream using a std::string.

#include <cctype>
#include <string>
#include <vector>
#include <iostream>
using namespace std;

int main()
{
    string s;
    vector< int > word_length_histogram;

    while ( cin >> s ) // attempt to get a word and stop loop upon failure
    {
        // strip trailing punctuation; the empty() check stops the loop
        // once a punctuation-only token has been erased completely
        while ( ! s.empty() && ispunct( static_cast< unsigned char >( s[ s.size() - 1 ] ) ) )
            s.erase( s.size() - 1 );

        if ( s.size() >= word_length_histogram.size() ) {
            word_length_histogram.resize( s.size() + 1 );
        } // make sure there's room in the histogram

        ++ word_length_histogram[ s.size() ]; // index 0 collects punctuation-only tokens
    }
    return 0;
}

At the end, word_length_histogram[1] has the number of 1-character words, word_length_histogram[2] has the number of 2-character words, etc. Add up the contents of ranges within word_length_histogram to get the particular statistics you want.
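Adding up a range of the histogram could be done with std::accumulate, roughly as sketched here. sumRange is a hypothetical helper, and it takes a half-open range [first, stop) so that "6 or more" can be written as everything from index 6 up to the histogram's size.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: total of histogram[first] up to, but not including,
// histogram[stop]. The range is clamped to the histogram's size, so asking
// for lengths that never occurred simply yields 0.
int sumRange(const std::vector<int>& histogram, std::size_t first, std::size_t stop) {
    stop = std::min(stop, histogram.size());
    if (first >= stop) return 0;
    return std::accumulate(histogram.begin() + first, histogram.begin() + stop, 0);
}
```

For the three categories in the question, the calls would be sumRange(word_length_histogram, 1, 2) for single-character words, sumRange(word_length_histogram, 2, 6) for short words, and sumRange(word_length_histogram, 6, word_length_histogram.size()) for long words.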

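#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>
using namespace std;

// inputString is assumed to already hold the text to analyze.
vector<string> words;
int cLength = 0;
long shortWords = 0, medWords = 0, longWords = 0; // counters must start at zero

// split() takes the result container first, then the input.
// Extend the delimiter set as needed.
boost::algorithm::split(words, inputString, boost::is_any_of(" .,-_=+;()[]\\/"), boost::token_compress_on);
for ( unsigned long i = 0; i < words.size(); i++ )
{
    cLength = words[i].size();
    if ( cLength == 0 ) // empty token from a leading or trailing delimiter
    {
        continue;
    } else if ( cLength < 2 ) { // short word
        shortWords++;
    } else if ( cLength < 6 ) {
        medWords++;
    } else {
        longWords++;
    }
}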

