
Garbage Appended to Output When Reading Values from a File

I'm new to C++ file I/O, so the other day I decided to write a small program that simply reads a UTF-8 encoded string and a paired float from a binary file. The pattern is string-float with no extra data or spacing between pairs. EDIT: I've revised the code based on several answers. However, the output remains the same ("The RoommateAp 0").

string readString (ifstream* file)
{
    //Get the length of the upcoming string
    uint16_t stringSize = 0;
    file->read(reinterpret_cast<char*>(&stringSize), sizeof(char) * 2);

    //Now that we know how long buffer should be, initialize it
    char* buffer = new char[stringSize + 1];
    buffer[stringSize] = '\0';

    //Read in a number of chars equal to stringSize
    file->read(buffer, stringSize);
    //Build a string out of the data
    string result = buffer;

    delete[] buffer;
    return result;
}

float readFloat (ifstream* file)
{
    float buffer = 0;
    file->read(reinterpret_cast<char*>(&buffer), sizeof(float));
    return buffer;
}

int main()
{
    //Create new file that's open for reading
    ifstream file("movies.dat", ios::in|ios::binary);
    //Make sure the file is open before starting to read
    if (file.is_open())
    {
        while (!file.eof())
        {
            cout << readString(&file) << endl;
            cout << readFloat(&file)  << endl;
        }
        file.close();
    }
    else
    {
        cout << "Unable to open file" << endl;
    }
}

And a sample of data from the file, in hexadecimal (spaces added for readability):

000C 54686520526F6F6D6D617465 41700000

As one can see, the first two bytes are the length of the string (12 in this case), followed by twelve characters (which spell "The Roommate"), and the final four bytes are a float.

When I run this code, the only thing that happens is that the terminal hangs and I have to close it manually. I think it may be because I am reading past the end of the file, but I have no idea why this would happen. What am I doing incorrectly?
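One thing the sample bytes suggest: the length 12 is stored as 00 0C, most significant byte first (big-endian). On a little-endian host, reading those two bytes straight into a uint16_t yields 0x0C00 (3072), so the following read would ask for far more than 12 characters. That would be consistent with the "The RoommateAp 0" output, since the float bytes 41 70 00 00 are the characters 'A', 'p' and two NULs. The sketch below decodes the sample record under that big-endian assumption. The helper names readU16BE, readFloatBE and sampleRecord are made up for illustration, and the code also assumes a 32-bit IEEE-754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read a 2-byte big-endian unsigned integer.
// Returns false if the stream could not supply both bytes.
bool readU16BE(std::istream& in, std::uint16_t& out)
{
    unsigned char b[2];
    if (!in.read(reinterpret_cast<char*>(b), sizeof b)) return false;
    out = static_cast<std::uint16_t>((b[0] << 8) | b[1]);
    return true;
}

// Hypothetical helper: read a 4-byte big-endian float.
// Assumption: float is a 32-bit IEEE-754 value on this platform.
bool readFloatBE(std::istream& in, float& out)
{
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), sizeof b)) return false;
    std::uint32_t bits = (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16)
                       | (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
    static_assert(sizeof(float) == sizeof bits, "assumes a 32-bit float");
    std::memcpy(&out, &bits, sizeof out);   // reinterpret the bit pattern
    return true;
}

// Builds an in-memory stream holding the sample record from the question.
std::istringstream sampleRecord()
{
    const char bytes[] = "\x00\x0C" "The Roommate" "\x41\x70\x00\x00";
    return std::istringstream(std::string(bytes, sizeof bytes - 1),
                              std::ios::binary);
}
```

Under these assumptions the sample record decodes to a length of 12, the text "The Roommate", and the float 15.0.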

There are at least three issues. First, the line:

file->read(reinterpret_cast<char*>(stringSize), sizeof(char) * 2);

should probably take the address of stringSize:

file->read(reinterpret_cast<char*>(&stringSize), sizeof(stringSize));

Second, the line:

char* buffer = new char[stringSize];

doesn't allocate enough memory, since it doesn't take the NUL terminator into account. That code should do something like:

//Now that we know how long buffer should be, initialize it
char* buffer = new char[stringSize + 1];
//Read in a number of chars equal to stringSize
file->read(buffer, stringSize);
buffer[stringSize] = '\0';

Finally, the line:

return static_cast<string>(buffer);

fails to delete[] the buffer after instantiating a string from it, which will cause a memory leak.
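Both the terminator and the leak can be sidestepped by letting a std::string own the storage from the start. The following is only a sketch of that idea, not the code from the question; readBytes is a hypothetical helper name, and the length is assumed to have been read already.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read up to `length` bytes into a std::string.
// The string owns its storage, so there is nothing to delete[] and no
// terminator to add by hand.
std::string readBytes(std::istream& in, std::size_t length)
{
    std::string result(length, '\0');
    if (length > 0) {
        in.read(&result[0], static_cast<std::streamsize>(length));
        // Keep only what was actually read if the stream ran short.
        result.resize(static_cast<std::size_t>(in.gcount()));
    }
    return result;
}
```

If the stream holds fewer bytes than requested, the returned string is simply shorter and the stream's fail state is set, which the caller can check.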

Also note that std::string's UTF-8 support is quite poor out of the box. Fortunately, there are solutions.

Your code has a few serious problems:

  1. file->read(reinterpret_cast<char*>(stringSize), sizeof(char) * 2);

    In this part you are casting the current value of stringSize into a pointer. Most probably you meant to pass the address of the stringSize variable instead.

  2. char* buffer = new char[stringSize];

    This allocates an array of chars, but no one is going to free it. Use std::vector<char> buffer(stringSize); instead so that the memory management is done correctly for you. To get the address of the buffer you can use &buffer[0].

  3. return static_cast<string>(buffer);

    What you probably want is the string constructor that accepts pointers to the first and one-past-last characters. In other words: return std::string(&buffer[0], &buffer[0]+stringSize);
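Points 2 and 3 can be combined into a short sketch. The function name readStringViaVector is hypothetical, and the length is assumed to have been read from the file already.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// A std::vector<char> owns the temporary buffer, and the string is built
// from an explicit [first, one-past-last) range, so no NUL terminator is
// needed and embedded NUL bytes are preserved.
std::string readStringViaVector(std::istream& in, std::size_t stringSize)
{
    if (stringSize == 0) return std::string();  // &buffer[0] needs a non-empty vector
    std::vector<char> buffer(stringSize);       // elements start out as '\0'
    in.read(&buffer[0], static_cast<std::streamsize>(stringSize));
    return std::string(&buffer[0], &buffer[0] + stringSize);
}
```

Because the range constructor takes the length from the two pointers rather than from a terminator, the result always has exactly stringSize characters; the vector releases its memory automatically when the function returns.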

You cast to a character array. Does the string size number in the file include space for the null character? If not, you might be running off the end of the character array and looping forever on the cout write.
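Running off the end can also be guarded against at the stream level by testing each read, rather than polling eof() as the loop in the question does; eof() only becomes true after a read has already failed. The sketch below is an illustration under stated assumptions: readTitle and readAllTitles are hypothetical names, the length is assumed to be a 2-byte big-endian value as in the sample data, and the floats are left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one length-prefixed string; returns false on a short or failed read.
// Assumption: the length prefix is 2 bytes, most significant byte first.
bool readTitle(std::istream& in, std::string& out)
{
    unsigned char len[2];
    if (!in.read(reinterpret_cast<char*>(len), 2)) return false;
    std::size_t n = (std::size_t(len[0]) << 8) | len[1];
    std::string text(n, '\0');
    if (n > 0 && !in.read(&text[0], static_cast<std::streamsize>(n))) return false;
    out.swap(text);
    return true;
}

// Collects strings until a read fails, so a truncated record ends the loop
// instead of being printed as garbage.
std::vector<std::string> readAllTitles(std::istream& in)
{
    std::vector<std::string> titles;
    std::string t;
    while (readTitle(in, t)) titles.push_back(t);
    return titles;
}
```

With this shape, a record whose length prefix promises more bytes than the stream holds is dropped rather than returned partially filled.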

There are some issues with your code.

You state that the size of the string is specified by 4 bytes; that means you should use a uint32_t, not a uint16_t.

You do not free the memory used when you allocate the string buffer.

string readString(std::ifstream* file)
{
    // Get the length of the upcoming string.
    // The length of the string is specified with 4 bytes: use uint32_t, not uint16_t
    uint32_t stringSize = 0;    
    file->read(reinterpret_cast<char*>(&stringSize), sizeof(uint32_t));

    // Now that we know how long buffer should be, initialize it
    char* buffer = new char[stringSize + 1];
    buffer[stringSize] = '\0'; // null terminate the string

    //Read in a number of chars equal to stringSize
    file->read(buffer, stringSize);
    string result = buffer;

    delete[] buffer;
    return result;
}
