How to read a text file in a specific format in C++

I am reading a file word by word. I want to preprocess each word by converting all characters to lowercase and removing any non-alphabetic characters except the hyphen (-) and the apostrophe ('), and then display the result. The lowercase conversion already works, but I do not know how to remove the non-alphabetic characters while keeping hyphens and apostrophes. How can I do that?

void WordStats::ReadTxtFile()
{
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }

    for (std::string word; ifile >> word; )
    {
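        // Take the character as unsigned char: passing a negative char to tolower is undefined.
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });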
        WordMap& Words = (Dictionary.count(word) ? KnownWords : UnknownWords);
        Words[word].push_back(ifile.tellg()); 
    }

    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}

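You could use std::remove_if together with erase (the erase-remove idiom). Since the word has already been lowercased at that point, it is enough to keep 'a' through 'z', the apostrophe, and the hyphen: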

word.erase(std::remove_if(word.begin(), word.end(), [](char c)
{
    return (c < 'a' || c > 'z') && c != '\'' && c != '-';
}), word.end());
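
To see both steps together, here is a minimal self-contained sketch. The helper name NormalizeWord is hypothetical and is not part of the WordStats class from the question. It assumes plain ASCII text and applies the lowercase conversion first, so that the 'a' to 'z' range check in the filter is sufficient.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from the original WordStats class):
// lowercases a word, then drops every character except a-z, '-' and '\''.
std::string NormalizeWord(std::string word)
{
    // Take the character as unsigned char: passing a negative char
    // to std::tolower is undefined behavior.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Erase-remove idiom: remove_if moves the kept characters to the front
    // and returns the new logical end; erase then trims the leftover tail.
    word.erase(std::remove_if(word.begin(), word.end(),
                              [](char c)
                              {
                                  return (c < 'a' || c > 'z') && c != '\'' && c != '-';
                              }),
               word.end());
    return word;
}
```

Inside the reading loop this could be called as word = NormalizeWord(word); right after extraction. A token made up only of digits or punctuation becomes an empty string, so you may want to skip empty results before looking the word up in the dictionary.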
