Why is failbit set when eof is found on read?

I've read that <fstream> predates <exception>. Ignoring the fact that exceptions on fstream aren't very informative, I have the following question:

It's possible to enable exceptions on file streams using the exceptions() method.

ifstream stream;
stream.exceptions(ifstream::failbit | ifstream::badbit);
stream.open(filename.c_str(), ios::binary);

Any attempt to open a nonexistent file, a file without the correct permissions, or any other I/O problem will result in an exception. This works very well with an assertive programming style. The file was supposed to be there and be readable. If the conditions aren't met, we get an exception. If I wasn't sure whether the file could safely be opened, I could use other functions to test for it.

But now suppose I try to read into a buffer, like this:

char buffer[10];
stream.read(buffer, sizeof(buffer)); 

If the stream detects end-of-file before filling the buffer, it sets failbit, and an exception is thrown if exceptions were enabled. Why? What's the point of this? I could have detected that simply by testing eof() after the read:

char buffer[10];
stream.read(buffer, sizeof(buffer));
if (stream.eof()) // or stream.gcount() != sizeof(buffer)
    // handle eof myself

This design choice prevents me from using standard exceptions on streams and forces me to create my own exception handling for permission or I/O errors. Or am I missing something? Is there any way out? For example, can I easily test whether I can read sizeof(buffer) bytes from the stream before doing so?

The failbit is designed to allow the stream to report that some operation failed to complete successfully. This includes errors such as failing to open the file, trying to read data that doesn't exist, and trying to read data of the wrong type.

The particular case you're asking about is reprinted here:

char buffer[10];
stream.read(buffer, sizeof(buffer)); 

Your question is why failbit is set when end-of-file is reached before all of the requested input has been read. The reason is that the read operation failed: you asked to read 10 characters, but there weren't that many characters in the file. Consequently, the operation did not complete successfully, and the stream sets failbit (along with eofbit) to let you know this, even though the available characters are still read into the buffer.

If you want a read operation that reads up to some number of characters, you can use the readsome member function:

char buffer[10];
streamsize numRead = stream.readsome(buffer, sizeof(buffer)); 

This function will read characters up to the end of the file, but unlike read it doesn't set failbit if the end of the file is reached before the requested number of characters has been read. In other words, it says "try to read this many characters, but it's not an error if you can't. Just let me know how much you read." This contrasts with read, which says "I want precisely this many characters, and it's an error if you can't do it." Be aware that readsome only extracts characters that are immediately available in the stream's buffer (it consults rdbuf()->in_avail()), so how much it returns for a file stream is implementation-dependent and may be fewer characters than remain in the file.

EDIT: An important detail I forgot to mention is that eofbit can be set without triggering failbit. For example, suppose that I have a text file that contains the text

137

without any newlines or trailing whitespace afterwards. If I write this code:

ifstream input("myfile.txt");

int value;
input >> value;

Then at this point input.eof() will return true, because while reading the characters from the file the stream hit the end of the file when checking whether there were any more digits to read. However, input.fail() will return false, because the operation succeeded: we did read an integer from the file.

Hope this helps!

Using the underlying buffer directly seems to do the trick:

char buffer[10];
streamsize num_read = stream.rdbuf()->sgetn(buffer, sizeof(buffer));

Improving on @absence's answer, here is a function readeof() that does the same as read() but doesn't set failbit on EOF. Real read failures have also been tested, such as a transfer interrupted by forcibly removing a USB stick or by a dropped link while accessing a network share. It has been tested on Windows 7 with VS2010 and VS2013, and on Linux with gcc 4.8.1. On Linux, only USB stick removal has been tried.

#include <iostream>
#include <fstream>
#include <stdexcept>

using namespace std;

streamsize readeof(istream &stream, char *buffer, streamsize count)
{
    if (count == 0 || stream.eof())
        return 0;

    streamsize offset = 0;
    streamsize reads;
    do
    {
        // This consistently fails on gcc (linux) 4.8.1 with failbit set on read
        // failure. This apparently never fails on VS2010 and VS2013 (Windows 7)
        reads = stream.rdbuf()->sgetn(buffer + offset, count);

        // This rarely sets failbit on VS2010 and VS2013 (Windows 7) on read
        // failure of the previous sgetn()
        (void)stream.rdstate();

        // On gcc (linux) 4.8.1 and VS2010/VS2013 (Windows 7) this consistently
        // sets eofbit when the stream is at EOF as a consequence of sgetn(). It
        // should also throw if exceptions are enabled (or simply return if
        // they are not) when the previous rdstate() restored a failbit on
        // Windows. On Windows it sets eofbit most of the time, even on a real
        // read failure
        (void)stream.peek();

        if (stream.fail())
            throw runtime_error("Stream I/O error while reading");

        offset += reads;
        count -= reads;
    } while (count != 0 && !stream.eof());

    return offset;
}

#define BIGGER_BUFFER_SIZE 200000000

int main(int argc, char* argv[])
{
    ifstream stream;
    stream.exceptions(ifstream::badbit | ifstream::failbit);
    stream.open("<big file on usb stick>", ios::binary);

    char *buffer = new char[BIGGER_BUFFER_SIZE];

    streamsize reads = readeof(stream, buffer, BIGGER_BUFFER_SIZE);

    if (stream.eof())
        cout << "eof" << endl;

    // buffer was allocated with new[], so it must be released with delete[]
    delete[] buffer;

    return 0;
}

Bottom line: on Linux the behavior is more consistent and meaningful. With exceptions enabled, a real read failure will throw in sgetn(). Windows, by contrast, treats read failures as EOF most of the time.
