Reading a binary file 1 byte at a time

I am trying to read a binary file in C 1 byte at a time, and after searching the internet for hours I still cannot get it to retrieve anything but garbage and/or a segfault. Basically, the binary file is a list that is 256 items long, and each item is 1 byte (an unsigned integer between 0 and 255). I am trying to use fseek and fread to jump to the "index" within the binary file and retrieve that value. The code that I have currently:

unsigned int buffer;

int index = 3; // any index value

size_t indexOffset = 256 * index;
fseek(file, indexOffset, SEEK_SET);
fread(&buffer, 256, 1, file);

printf("%d\n", buffer);

Right now this code is giving me random garbage numbers and segfaulting. Any tips as to how I can get this to work right?

In your code you are trying to read 256 bytes into the address of one int. If you want to read one byte at a time, call fread(&buffer, 1, 1, file); (see fread).
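As a minimal sketch of the one-byte read, the helper below seeks to an item and reads exactly one byte into an unsigned char. The name read_byte_at is hypothetical and not from the thread. It also assumes the offset should be the index itself, since each item is 1 byte wide; with the question's 256 * index, index 3 would seek to offset 768, past the end of a 256-byte file.

```c
#include <stdio.h>

/* Hypothetical helper (name is an assumption, not from the thread):
 * reads the single byte stored at position `index` of a file that
 * holds one 1-byte item per position.
 * Returns the value (0..255) on success, or -1 on failure. */
int read_byte_at(FILE *file, long index)
{
    unsigned char value; /* exactly one byte of storage */

    /* Each item is 1 byte wide, so the byte offset equals the index. */
    if (fseek(file, index, SEEK_SET) != 0)
        return -1;

    /* Ask for one item of size 1; fread returns the number of items read. */
    if (fread(&value, 1, 1, file) != 1)
        return -1;

    return value;
}
```

A caller could then write int v = read_byte_at(file, 3); and print v with "%d" after checking that it is not -1.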

But a simpler solution is to declare an array of bytes, read it all in at once, and process it afterwards.

You're confusing bytes with int. The common type for a byte is unsigned char. Most bytes are 8 bits wide. If the data you are reading is 8 bits wide, you will need to read it in as 8 bits:

#define BUFFER_SIZE 256

unsigned char buffer[BUFFER_SIZE];

/* Read in 256 8-bit numbers into the buffer */
size_t bytes_read = 0;
bytes_read = fread(buffer, sizeof(unsigned char), BUFFER_SIZE, file_ptr);
// Note: sizeof(unsigned char) is for emphasis
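To round this out, here is a sketch of how the whole table might be loaded once and then indexed in memory. The helper name load_table and its return convention are assumptions for illustration; the point is only that the count returned by fread should be checked before the buffer is used.

```c
#include <stdio.h>

#define BUFFER_SIZE 256

/* Hypothetical helper (name and return convention are assumptions):
 * fills `table` with the first BUFFER_SIZE bytes of the file.
 * Returns 0 on success, -1 if fewer than BUFFER_SIZE bytes were read. */
int load_table(FILE *file_ptr, unsigned char table[BUFFER_SIZE])
{
    size_t bytes_read;

    rewind(file_ptr); /* start from the beginning of the file */

    /* One request for 256 items of 1 byte each. */
    bytes_read = fread(table, 1, BUFFER_SIZE, file_ptr);

    return bytes_read == BUFFER_SIZE ? 0 : -1;
}
```

After a successful call, the value at any index is simply table[index], which can be printed with printf("%u\n", (unsigned)table[index]); without touching the file again.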

The reason for reading all the data into memory is to keep the I/O flowing. There is an overhead associated with each input request, regardless of the quantity requested. Reading one byte at a time, or seeking to one position at a time, is the worst case.

Here is an example of the overhead required for reading 1 byte:

Tell OS to read from the file.
OS searches to find the file location.
OS tells disk drive to power up.
OS waits for disk drive to get up to speed.
OS tells disk drive to position to the correct track and sector.
-->OS tells disk to read one byte and put into drive buffer.
OS fetches data from drive buffer.
Disk spins down to a stop.
OS returns 1 byte to your program.

In your program design, the above steps will be repeated 256 times. With everybody's suggestion, the line marked with "-->" will read 256 bytes. Thus the overhead is incurred only once instead of 256 times to get the same quantity of data.

unsigned char buffer; // note: 1 byte
fread(&buffer, 1, 1, file);

It is time to read the man pages, I believe.

You are trying to read 256 bytes into a 4-byte integer variable called "buffer". You are overwriting the next 252 bytes of other data.

It seems like buffer should either be unsigned char buffer[256]; or you should be doing fread(&buffer, 1, 1, f), and in that case buffer should be unsigned char buffer;.

Alternatively, if you just want a single character, you could just leave buffer as int (unsigned is not needed because C99 guarantees a reasonable minimum range for plain int) and simply say:

buffer = fgetc(f);
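A short sketch of the fgetc route follows. The helper name read_with_fgetc is hypothetical; the detail worth showing is that the result of fgetc is kept in an int, so that EOF can be told apart from a legitimate byte value of 255.

```c
#include <stdio.h>

/* Hypothetical helper (name is an assumption): copies up to `capacity`
 * bytes from `f` into `out` using fgetc, and returns how many bytes
 * were stored. */
size_t read_with_fgetc(FILE *f, unsigned char *out, size_t capacity)
{
    size_t n = 0;
    int c; /* int, not char: EOF must stay distinguishable from byte 255 */

    while (n < capacity && (c = fgetc(f)) != EOF)
        out[n++] = (unsigned char)c;

    return n;
}
```

If the returned count is smaller than expected, feof(f) and ferror(f) can tell the caller whether the file simply ended or a read error occurred.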

There are a couple of problems with the code as it stands.

The prototype for fread is:

size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);

You've set the size to 256 (bytes) and the count to 1. That's fine; it means "read one lump of 256 bytes, shove it into the buffer".

However, your buffer is on the order of 2-8 bytes long (or, at least, vastly smaller than 256 bytes), so you have a buffer overrun. You probably want to use fread(&buffer, 1, 1, file).

Furthermore, you're writing byte data through an int pointer. This will work on one endianness (little-endian, in fact), so you'll be fine on Intel architecture, and from that you'll learn bad habits that WILL come back and bite you one of these days.

Try really hard to write byte data only into byte-organised storage, rather than into ints or floats.
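To make the endianness point concrete, here is a small sketch that copies one byte into the first byte of a zeroed unsigned int, which is roughly what fread(&buffer, 1, 1, file) does to an int-typed buffer. The function name byte_into_int is hypothetical. Note that it zeroes the int first; the code in the question does not, so the untouched bytes there would hold whatever was in memory.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo (name is an assumption): places `byte` at the lowest
 * address of a zeroed unsigned int and returns the resulting value.
 * On a little-endian machine the result equals `byte`; on a big-endian
 * machine the byte lands in the most significant position instead. */
unsigned int byte_into_int(unsigned char byte)
{
    unsigned int wide = 0;    /* zeroed, unlike the question's buffer */

    memcpy(&wide, &byte, 1);  /* same effect as reading 1 byte into &wide */

    return wide;
}
```

With an unsigned char destination there is nothing to get wrong: the byte read is the value, whatever the byte order of the machine.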
