How do I read words from a file, assign them to an array and analyze its content?
I'm a student (my professor encourages online research to complete projects) with an assignment to analyze the contents of a file: the frequency of certain words, the total word count, and the largest and smallest word. I'm stuck on even opening the file so the program can get the words out. I've tried just counting the words that it reads, and I get nothing. As I understand it, the program should currently open the selected .txt file, go through its contents word by word, and output each word.

Here's the code:
```cpp
#include <iostream>
#include <string>
#include <cctype>
#include <fstream>
#include <sstream>

string selected[100];

// Open the selected file.
ifstream file;
file.open(story.c_str());
string line;
if (!file.good())
{
    cout << "Problem with file!" << endl;
    return 1;
}
while (!file.eof())
{
    getline(file, line);
    if (line.empty())
        continue;
    istringstream iss(line);
    for (string word; iss >> word;)
        cout << word << endl;
}
```
Because the attached code is simple, I will not give detailed explanations here. With the facilities of the C++ standard library (`<algorithm>`, `<regex>`, `<map>`), each task can be done in essentially one statement. The code needs C++17.

We read the complete source file into one `std::string`. Then we define a `std::vector` and fill it with all the words. The words are defined by an ultra simple regex.

The frequency is counted with a standard approach using `std::map`.
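To make the word definition concrete, here is a minimal sketch of what a `\w+` pattern extracts from a piece of text. The helper name `splitIntoWords` is made up for this illustration and is not part of the program. With the default ECMAScript grammar, `\w` matches letters, digits and the underscore, so an apostrophe or a comma ends a word.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration: returns every match of \w+ in the text.
std::vector<std::string> splitIntoWords(const std::string& text) {
    // \w matches letters, digits and '_' in the default ECMAScript grammar
    static const std::regex wordPattern{R"(\w+)"};

    // The token iterator visits each match; a default-constructed
    // iterator marks the end of the sequence.
    return std::vector<std::string>(
        std::sregex_token_iterator(text.begin(), text.end(), wordPattern),
        std::sregex_token_iterator());
}
```

Called with `"Don't stop, item_2!"`, this yields the four words `Don`, `t`, `stop` and `item_2`. If contractions should stay in one piece, the pattern would have to be adapted. The full program that follows applies the same pattern to the whole file contents.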
#include <fstream>
#include <string>
#include <iterator>
#include <vector>
#include <regex>
#include <iostream>
#include <algorithm>
#include <map>

// A word is a run of one or more word characters (\w: letters, digits, underscore)
std::regex patternForWord{R"((\w+))"};

int main() {
    // Open the file and check whether it could be opened
    if (std::ifstream sampleFile{ "r:\\sample.txt" }; sampleFile) {

        // Read the complete file into a std::string
        std::string wholeFile(std::istreambuf_iterator<char>(sampleFile), {});

        // Put all words from the whole file into a vector
        std::vector<std::string> words(std::sregex_token_iterator(wholeFile.begin(), wholeFile.end(), patternForWord, 1), {});

        // Without any words there is no longest or shortest word to report
        if (words.empty()) {
            std::cerr << "*** Error: No words found in input file\n\n";
            return 1;
        }

        // Get the longest and shortest word
        const auto [min, max] = std::minmax_element(words.begin(), words.end(),
            [](const std::string& s1, const std::string& s2) { return s1.size() < s2.size(); });

        // Count the frequency of words
        std::map<std::string, size_t> wordFrequency{};
        for (const std::string& word : words) wordFrequency[word]++;

        // Show the result to the user
        std::cout << "\nNumber of words: " << words.size()
            << "\nLongest word: " << *max << " (" << max->size() << ")"
            << "\nShortest word: " << *min << " (" << min->size() << ")"
            << "\nWord frequencies:\n";
        for (const auto& [word, count] : wordFrequency) std::cout << word << " --> " << count << "\n";
    }
    else {
        std::cerr << "*** Error: Problem with input file\n\n";
    }
    return 0;
}