How do I write an algorithm that finds which lines a particular word appears on (I am using std::map)?

Question

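I am writing code that counts how many times each word appears in a text (this is an assignment), but I can't find a way to work out which lines those words appear on.

I don't know where to start.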


#include "header.h"
int main()
{
    // read the input, keeping track of each word and how often we see it
    std::ifstream in("tekstas.txt"); // input file

    std::string input;
    std::map<std::string, int> counters; // store each word and an associated counter
    std::vector<char> CharVect; // vector that stores symbols i want to replace
    formuojuChar(CharVect); // pushbacking that vector with symbols
    for (unsigned int i = 0; !in.eof(); i++)
    {
        std::getline(in, input);
        std::transform(input.begin(), input.end(), input.begin(), ::tolower); // lowering letters so for example "Mom"= "mom"
        Replace(input, CharVect, ' '); // replace symbols with space
        std::stringstream read(input);
        std::string word;
        while (read >> word)
        {
            ++counters[word];
        }
    }
    std::ofstream out("isvestis.txt");
    std::cout << "Words that appear more than once in the text: " << std::endl;
    std::cout << std::endl;
    for (std::map<std::string, int>::const_iterator it = counters.begin(); it != counters.end(); ++it)
    {
        if ((it->second) > 1)
        {
            std::cout << "'" << it->first << "' " << "appears " << it->second << " times in lines: ";
            /*
             ANY IDEAS ?
             */
            std::cout << std::endl;
        }
    }
    return 0;
}

I would like the output to show which lines of the .txt file each word appears on. TY

Answer 1

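This looks like a learning exercise you want to work through yourself, and I have a policy of not writing code for those.

One thing you can do, though, is count the newlines you encounter (which tells you which line you are on), and whenever you see the text you are searching for, insert the current line number into a std::set<unsigned> or a std::vector<unsigned>.

You will probably want to do this in a single loop, perhaps reading one line at a time. When you encounter the search word, update both the word counter and the set of line numbers.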

Answer 2

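The main problem with your approach is that you use a second loop to gather the information about the words, and by then you have lost all information about which lines the words were on.

Instead of trying to work out which line you are on in the second loop, note that the first loop already has everything you need to know about the current line. All you need is a variable that keeps track of the line number. You are using (incorrectly, I might add) std::getline: each call to that function moves you to the next line, so inside the first loop you implicitly know which line you are on.

First, you need to fix the read loop so that it reads lines from the file correctly: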

std::string line;
while (std::getline(in, line))
{
//...
}

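Second, inside the while loop you have all the information you need to determine the word, the word count, and the line the word was found on. You do not need two loops to do this.

Instead of a std::map<std::string, int>, which only knows the word count, you can create a map that holds all of the information: the word count and the lines the word was found on. Here is a map type that can hold this information: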

std::map<std::string, std::pair<int, std::set<int>>>

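The map's "second" holds the count together with a std::set that stores every line number the word was found on. A std::set is used to make sure that no duplicate line numbers are stored.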

Putting it all together, here is a sample program that uses this type:

#include <map>
#include <set>
#include <string>
#include <sstream>
#include <iostream>

// pair and map type
using WordInfo = std::pair<int, std::set<int>>;
using WordMap = std::map<std::string, WordInfo>;

int main()
{
    // our map
    WordMap wm;
    std::string line;

    // the line count
    int line_number = 1;
    while (std::getline(std::cin, line))
    {
        // line parser
        std::istringstream strm(line);
        std::string word;
        while ( strm >> word)
        {
            // we call map::insert, not `[ ]` to insert into a map
            auto pr = wm.insert({word, {0,std::set<int>()}});

            // the return value of map::insert gives us a pair, where the first is 
            // an iterator to the item in the map
            auto& mapIter = pr.first;

            // increment the word count   
            ++(mapIter->second.first);

            // insert the line number into the set
            mapIter->second.second.insert(line_number);
        }

        // increment the line counter
        ++line_number;
    }

    // output results
    for (auto& m : wm )
    {
        std::cout << "The word  \"" << m.first << "\" appears " << m.second.first << " times on the following lines:\n";
        for ( auto& m2 : m.second.second)
            std::cout << m2 << " ";
        std::cout << "\n\n";
    }
}

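So what was done here?

1) The line each word is on is known inside the read loop. All we have to do is increment the line counter for each line read.

2) We use std::map::insert to insert entries into the map rather than std::map::operator[]. The reason is that map::insert does not insert the entry if it already exists, and inserts a brand-new entry if it does not; either way, std::map::insert returns a pair whose first member is an iterator to the item in the map.

We need that iterator for the processing that follows. In the next few lines we simply increment the count and update the std::set.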

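Here is a live example.

Note: I don't know all the replacements you are doing in your original program, so I skipped them and focused only on the task of determining the words and the lines they are on.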

Solution 1
1 2019-05-26 21:44:01

Solution 2
0 2019-05-26 22:34:04
