Getline and EOF

Question

I am trying to read a file. The file contents have a newline between the words of a sentence and two newlines between sentences. I can only read one sentence. I tried passing EOF as the delimiter to getline, but that does not seem to work. Does anyone have any suggestions on how to resolve this?

The file contents are:

County

Grand

Jury

said Friday an investigation of Atlanta's recent primary
election produced `` no evidence '' that any
irregularities took place . The jury further said in
term-end presentments that the City Executive Committee

But what gets printed is:

County

Grand Jury said Friday an investigation of Atlanta's
recent primary election produced `` no evidence '' that any irregularities took place .

string line;
string a, b;
ifstream infile("myFile");

while (getline(infile, line))
{
    istringstream iss(line);

    if (!(iss >> a >> b)) { break; } // error

    cout << a << b << endl;
}

Answer 1

#include <iostream>
#include <iterator>   // std::istreambuf_iterator
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>

using namespace std;

typedef boost::tokenizer<boost::char_separator<char>,
        std::istreambuf_iterator<char> >
    tokenizer;

void printPhrase(const vector<string>& _phrase) {
    if (!_phrase.empty()) {
        vector<string>::const_iterator it = _phrase.begin();
        cout << "Phrase: \"" << *it;
        for(++it; it != _phrase.end(); ++it)
            cout << "\", \"" << *it;
        cout << "\"" << endl;
    } else
       cout << "Empty phrase" << endl;
}

int main() {
    boost::char_separator<char> sep("", "\n", boost::drop_empty_tokens);
    istreambuf_iterator<char> citer(cin);
    istreambuf_iterator<char> eof;
    tokenizer tokens(citer, eof, sep);

    int eolcount = 0;
    vector<string> phrase;
    for (tokenizer::iterator it = tokens.begin(); it != tokens.end(); ++it) {
        if (*it == "\n") {
            eolcount ++;
            if (eolcount > 1 && eolcount % 2 == 0) { // phrase end
                printPhrase(phrase);
                phrase.clear();
            }
        } else {
            eolcount = 0;
            phrase.push_back(*it);
        }
    }
    if (!phrase.empty())
        printPhrase(phrase);
    return 0;
}

The basic idea is to keep the newlines as tokens in the tokenizer output and count them: whenever the count of consecutive newlines reaches an even number (2, 4, ...), the words collected so far are printed. A non-newline token breaks the sequence and is added to the accumulator.
