
Counting the word that appears most often in a paragraph

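Today I ran into a problem with my code. The task is to read a file containing

"Today is Sunday. Tomorrow is Monday. Yesterday is Saturday."

then count the number of words in each sentence and the number of sentences in the paragraph, find the word that appears most often in the paragraph, and write the result to a file. I have finished the first two parts, but for the last one, when I run the code, it prints:

"Monday", or nothing at all.

Could I get some advice on how to fix this? The code is below. Thank you very much!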

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{

ifstream is;
is.open("test.txt", ios::in);
string total = "";
if (is.is_open())
{
    string line = "";
    while (getline(is, line))
    {
        total += line;
    }

    is.close();
}
ofstream os;
os.open("tes.txt", ios::out);
os << total << endl;
os.close();
vector<string> sen_vector;
size_t prev_pos = 0;
size_t cur_pos = total.find_first_of("!?.");
while (cur_pos != string::npos)
{
    string sen = total.substr(prev_pos, cur_pos - prev_pos);
    sen_vector.push_back(sen);
    prev_pos = cur_pos + 2;
    cur_pos = total.find_first_of("!?.", prev_pos);
}
vector<vector<string>> para_vector;

for (int i = 0; i < sen_vector.size(); i++)
{
    vector<string> temp;

    string sen = sen_vector[i] + " ";
    size_t prev_pos_w = 0;
    size_t cur_pos_w = sen.find(' ', prev_pos_w);
    while (cur_pos_w != string::npos)
    {
        string word = sen.substr(prev_pos_w, cur_pos_w - prev_pos_w);
        temp.push_back(word);
        prev_pos_w = cur_pos_w + 1;
        cur_pos_w = sen.find(' ', prev_pos_w);
    }
    para_vector.push_back(temp);
}

for (int i = 0; i < para_vector.size(); i++)
{
    for (int j = 0; j < para_vector[i].size(); j++)
    {
        cout << para_vector[i][j] << ' ';
    }
}
cout << endl;
cout << "So cau trong doan: " << size(para_vector) << endl; // Amount of sentences in a paragraph.
for (int i = 0; i<sen_vector.size(); i++)
    cout << "So tu trong cau " << i + 1 << " la: " << size(para_vector[i]) << endl; // Amount of words in a sentence.
string a[100], d[100];
int n = 0;
for (int i = 0; i < sen_vector.size(); i++) // From sentence to sentence-array
{
    a[i] = sen_vector[i] + " ";
    n++;
}
cout << endl; 
int dem = 0, m = 0, vt = 0;
int b[100], dt = 0;
for (int i = 0; i < sen_vector.size(); i++)  // From sentence-array to word-array
{
    size_t prev_pos_w = 0;
    size_t cur_pos_w = a[i].find(' ', prev_pos_w);

    for (int j = 0; j < n; j++)
    {
        while (cur_pos_w != string::npos)
        {
            d[i] = a[i].substr(prev_pos_w, cur_pos_w - prev_pos_w);
            prev_pos_w = cur_pos_w + 1;
            cur_pos_w = a[i].find(' ', prev_pos_w);
            cout << d[i] << " ";
            dt++;
        }

    }
}

/*for (int i = 0; i < dt-1; i++)    // I got confused with these code (it came nothing when ran)
{
    for (int j = 1; j < dt; j++) 
    {
        if (d[i] == d[j])
        {
            count++;
        }
    }
    b[i] = count;
}
int max = 0;
for (int i = 0; i <= n; i++)
{
    if (max < b[i])
    {
        max = b[i];
        vt = i;
    }
}
cout << d[vt];*/
system("pause");
return 0;


}

I would use a std::multiset, which stores each word once for every time it is found.

std::multiset<std::string> word_set;

std::string word;
while (is >> word) {
    word_set.insert(word); // it might be a good idea to remove non-word chars
}

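Then you can iterate over the elements and return the one with the highest multiplicity:

std::string most_seen = "";
std::size_t count = 0;

for (const std::string& i : word_set) {
    // count(i) returns how many copies of this word are in the multiset
    if (word_set.count(i) > count) {
        count = word_set.count(i); // remember the best count seen so far
        most_seen = i;
    }
}
return most_seen;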

