
Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?

I just found a comment in this answer saying that using iostream::eof in a loop condition is "almost certainly wrong". I generally use something like while(cin>>n), which I guess implicitly checks for EOF.

Why is checking for eof explicitly using while (!cin.eof()) wrong?

How is it different from using scanf("...",...)!=EOF in C (which I often use with no problems)?

Because iostream::eof will only return true after a read has hit the end of the stream. It does not indicate that the next read will hit the end of the stream.

Consider this (and assume the next read will be at the end of the stream):

while(!inStream.eof()){
  int data;
  // yay, not end of stream yet, now read ...
  inStream >> data;
  // oh crap, now we read the end and *only* now the eof bit will be set (as well as the fail bit)
  // do stuff with (now uninitialized) data
}

Compare with this:

int data;
while(inStream >> data){
  // when we land here, we can be sure that the read was successful.
  // if it wasn't, the returned stream from operator>> would be converted to false
  // and the loop wouldn't even be entered
  // do stuff with correctly initialized data (hopefully)
}
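To make the timing concrete, here is a minimal sketch that records what eof() reports just before and just after the read that runs off the end. The struct EofProbe and the function probe_read_past_end are hypothetical names introduced only for this illustration, and a std::istringstream stands in for the input stream.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper for illustration: records what eof() reports around
// the read that runs off the end of the input.
struct EofProbe {
    bool eof_before_read;  // eof() just before the final, failing read
    bool read_succeeded;   // whether that final read extracted a value
    bool eof_after_read;   // eof() just after it
};

// Reads one int that is expected to succeed, then attempts one more read.
EofProbe probe_read_past_end(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;  // consumes the number; a trailing newline is left unread

    EofProbe probe{};
    probe.eof_before_read = in.eof();
    probe.read_succeeded = static_cast<bool>(in >> value);
    probe.eof_after_read = in.eof();
    return probe;
}
```

Calling probe_read_past_end("42\n") should report that eof() was still false before the second read, that the read failed, and that eof() became true only afterwards.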

And on your second question: Because

if(scanf("...",...)!=EOF)

is the same as

if(!(inStream >> data).eof())

and not the same as

if(!inStream.eof())
    inStream >> data;

Bottom line, up top: With proper handling of white-space, the following is how eof can be used (and can even be more reliable than fail() for error checking):

while( !(in>>std::ws).eof() ) {  
   int data;
   in >> data;
   if ( in.fail() ) /* handle with break or throw */; 
   // now use data
}    

(Thanks to Tony D for the suggestion to highlight the answer. See his comment below for an example of why this is more robust.)


The main argument against using eof() seems to be missing an important subtlety about the role of white space. My proposition is that checking eof() explicitly is not only not "always wrong" (which seems to be an overriding opinion in this and similar SO threads), but that with proper handling of white-space it provides cleaner and more reliable error handling, and is the always-correct solution (although not necessarily the tersest).

To summarize, what is being suggested as the "proper" termination and read order is the following:

int data;
while(in >> data) {  /* ... */ }

// which is equivalent to 
while( !(in >> data).fail() )  {  /* ... */ }

The failure due to a read attempt beyond eof is taken as the termination condition. This means that there is no easy way to distinguish between a successful stream and one that really fails for reasons other than eof. Take the following streams:

  • 1 2 3 4 5<eof>
  • 1 2 a 3 4 5<eof>
  • a<eof>

while(in>>data) terminates with a set failbit for all three inputs. In the first and third, eofbit is also set. So past the loop one needs very ugly extra logic to distinguish a proper input (1st) from improper ones (2nd and 3rd).

Whereas, take the following:

while( !in.eof() ) 
{  
   int data;
   in >> data;
   if ( in.fail() ) /* handle with break or throw */; 
   // now use data
}    

Here, in.fail() verifies that as long as there is something to read, it is the correct one. Its purpose is not merely to terminate the while loop.

So far so good, but what happens if there is trailing space in the stream, which sounds like the major concern against eof() as a terminator?

We don't need to surrender our error handling; just eat up the white-space:

while( !in.eof() ) 
{  
   int data;
   in >> data >> ws; // eat whitespace with std::ws
   if ( in.fail() ) /* handle with break or throw */; 
   // now use data
}

std::ws skips any potential (zero or more) trailing space in the stream while setting the eofbit, and not the failbit. So, in.fail() works as expected, as long as there is at least one data item to read. If all-blank streams are also acceptable, then the correct form is:

while( !(in>>ws).eof() ) 
{  
   int data;
   in >> data; 
   if ( in.fail() ) /* handle with break or throw */; 
   /* this will never fire if the eof is reached cleanly */
   // now use data
}
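As a rough, self-contained sketch of this last form, the function below applies it to a string and reports whether the whole input was valid. The name read_all_ints, its signature, and the use of std::istringstream are assumptions made for the example, not part of the answer above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated ints from `text` into `out`.
// Returns true if the whole input was consumed cleanly (including an empty or
// all-blank input), false as soon as a token cannot be read as an int.
bool read_all_ints(const std::string& text, std::vector<int>& out) {
    std::istringstream in(text);
    while (!(in >> std::ws).eof()) {  // skip whitespace first, then test for end
        int data;
        in >> data;
        if (in.fail()) {
            return false;  // something was there, but it was not an int
        }
        out.push_back(data);
    }
    return true;
}
```

With the three example streams, this sketch should accept "1 2 3 4 5" and reject both "1 2 a 3 4 5" and "a", while an all-blank input is accepted with no values read.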

Summary: A properly constructed while(!eof) is not only possible and not wrong, but allows data to be localized within scope, and provides a cleaner separation of error checking from business as usual. That being said, while(!fail) is inarguably a more common and terse idiom, and may be preferred in simple (single data item per read) scenarios.

Because if programmers don't write while(stream >> n), they possibly write this:

while(!stream.eof())
{
    stream >> n;
    //some work on n;
}

Here the problem is that you cannot do some work on n without first checking whether the stream read was successful, because if it was unsuccessful, your work on n would produce undesired results.

The whole point is that eofbit, badbit, or failbit are set after an attempt is made to read from the stream. So if stream >> n fails, then eofbit, badbit, or failbit is set immediately, so it's more idiomatic to write while (stream >> n), because the returned object stream converts to false if there was some failure in reading from the stream, and consequently the loop stops. And it converts to true if the read was successful, and the loop continues.

The other answers have explained why the logic is wrong in while (!stream.eof()) and how to fix it. I want to focus on something different:

why is checking for eof explicitly using iostream::eof wrong?

In general terms, checking for eof only is wrong because stream extraction (>>) can fail without hitting the end of the file. If you have e.g. int n; cin >> n; and the stream contains hello, then h is not a valid digit, so extraction will fail without reaching the end of the input.

This issue, combined with the general logic error of checking the stream state before attempting to read from it, which means for N input items the loop will run N+1 times, leads to the following symptoms:

  • If the stream is empty, the loop will run once. >> will fail (there is no input to be read) and all variables that were supposed to be set (by stream >> x) are actually uninitialized. This leads to garbage data being processed, which can manifest as nonsensical results (often huge numbers).

    (If your standard library conforms to C++11, things are a bit different: a failed >> now sets numeric variables to 0 instead of leaving them uninitialized (except for chars).)

  • If the stream is not empty, the loop will run again after the last valid input. Since in the last iteration all >> operations fail, variables are likely to keep their value from the previous iteration. This can manifest as "the last line is printed twice" or "the last input record is processed twice".

    (This should manifest a bit differently since C++11 (see above): now you get a "phantom record" of zeroes instead of a repeated last line.)

  • If the stream contains malformed data but you only check for .eof, you end up with an infinite loop. >> will fail to extract any data from the stream, so the loop spins in place without ever reaching the end.
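The symptoms listed above can be observed by counting how often the body of an eof()-controlled loop runs. This is only a sketch: count_eof_loop_iterations is a hypothetical helper, the input comes from a std::istringstream, and the max_iterations cap is an assumption added here so that the malformed-data case terminates instead of spinning forever.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: runs the eof()-controlled loop over `text` and returns
// how many times the body executed. `max_iterations` is a safety cap added
// for this sketch so that the malformed-input case terminates.
int count_eof_loop_iterations(const std::string& text, int max_iterations) {
    std::istringstream in(text);
    int iterations = 0;
    while (!in.eof() && iterations < max_iterations) {
        int n = 0;
        in >> n;  // result deliberately unchecked, as in the flawed pattern
        ++iterations;
    }
    return iterations;
}
```

An empty input should give one iteration, three numbers followed by a newline should give four, and an input such as "1 2 x\n" should run until the cap is hit.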


To recap: The solution is to test the success of the >> operation itself, not to use a separate .eof() method: while (stream >> n >> m) { ... }, just as in C you test the success of the scanf call itself: while (scanf("%d%d", &n, &m) == 2) { ... }.
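A small sketch of the recap's while (stream >> n >> m) form, reading pairs from a string. The helper name read_pairs and the std::istringstream input are assumptions for the example; an incomplete or malformed pair simply ends the loop without being stored.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects complete (n, m) pairs from `text`.
std::vector<std::pair<int, int>> read_pairs(const std::string& text) {
    std::istringstream stream(text);
    std::vector<std::pair<int, int>> pairs;
    int n = 0;
    int m = 0;
    // The body runs only when both extractions succeeded.
    while (stream >> n >> m) {
        pairs.emplace_back(n, m);
    }
    return pairs;
}
```

For "1 2 3 4 5" this should yield the pairs (1, 2) and (3, 4); the dangling 5 has no partner, so the chained extraction fails and the loop stops.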

The important thing to remember is that inFile.eof() doesn't become true until after an attempted read fails because you've reached the end of the file. So, in this example, you'll get an error.

while (!inFile.eof()){
    inFile >> x;
    process(x);
}

The way to make this loop correct is to combine reading and checking into a single operation, like so:

while (inFile >> x) 
    process(x); 

By convention, operator>> returns the stream we read from, and a Boolean test on a stream returns false when the stream fails (such as reaching end of file).

So this gives us the correct sequence:

  • read
  • test whether the read succeeds
  • if and only if the test succeeds, process what we've read
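The same read, test, process sequence applies to line-based input with std::getline, which also returns the stream it read from. The sketch below assumes a hypothetical helper read_lines and uses a std::istringstream in place of a file.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line of `text`.
std::vector<std::string> read_lines(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    // Read and test in one step; the body processes only successful reads.
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Whether or not the input ends with a newline, each line should appear exactly once in the result.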

If you happen to encounter some other problem that prevents you from reading from the file correctly, you will never reach eof() at all. For example, let's look at something like this:

int x; 
while (!inFile.eof()) { 
    inFile >> x; 
    process(x);
} 
    

Let us trace through the working of the above code with an example:

  • Assume the contents of the file are '1', '2', '3', 'a', 'b'.
  • The loop will read the 1, 2, and 3 correctly.
  • Then it'll get to a.
  • When it tries to extract a as an int, it'll fail.
  • The stream is now in a failed state; until or unless we clear the stream, all attempts at reading from it will fail.
  • But, when we test for eof(), it'll return false, because we're not at the end of the file: the a is still waiting to be read.
  • The loop will keep trying to read from the file, and fail every time, so it never reaches the end of the file.
  • So, the loop above will run forever.

But, if we use a loop like this, we will get the required output.

while (inFile >> x)
    process(x);

In this case, the stream will convert to false not only in case of end of file, but also in case of a failed conversion, such as the a that we can't read as an integer.
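The trace above notes that a failed stream stays failed until it is cleared. If skipping the bad tokens is preferable to stopping at the first one, one possible approach (not something the answer itself prescribes) is to call clear() and discard the offending token. The helper name read_ints_skipping_bad_tokens is hypothetical, and the policy of dropping one whitespace-delimited token per failure is an assumption of this sketch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every int it can, discarding tokens that do not
// parse as an int instead of stopping at the first one.
std::vector<int> read_ints_skipping_bad_tokens(const std::string& text) {
    std::istringstream in(text);
    std::vector<int> values;
    int x = 0;
    while (true) {
        if (in >> x) {
            values.push_back(x);
            continue;
        }
        if (in.eof() || in.bad()) {
            break;  // nothing left to read, or an unrecoverable error
        }
        in.clear();  // reset failbit so the stream can be read again
        std::string junk;
        if (!(in >> junk)) {  // drop the token that could not be parsed
            break;
        }
    }
    return values;
}
```

For the file contents used in the trace ("1 2 3 a b"), this sketch should return 1, 2, 3 and then stop at the end of the input rather than looping forever.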

Using iostream::eof in a loop condition is considered wrong because eof() returning false only means that we haven't reached EOF yet. It does not mean that the next read will succeed.

I'll explain my statement with sample code, which should help you understand the concept better. Say we want to read a file using file streams in C++. When we use a loop to read from the file and check for the end of the file using stream.eof(), we are actually checking whether the stream has already reached the end, not whether the next read will succeed.

Example Code

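#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main() {
   ifstream myFile("myfile.txt");
   if (!myFile) {
      return 1; // could not open the file; eof() would never become true
   }
   string x;
   while (!myFile.eof()) {
      myFile >> x;
      // eof() was tested before the read, so check again whether the read succeeded
      if (myFile) {
         // Do something with x
      }
   }
}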
