
What is the advantage of using fseek over using a sequence of fread in C?

I'm a beginner in C programming and I have some questions about how to deal with files.

Let us suppose that we have a binary file with N int values stored. Let us suppose that we want to read the i-th int value in the file.

Is there any real advantage of using fseek for positioning the file pointer to the i-th int value and reading it after the fseek instead of using a sequence of i fread calls?

Intuitively, I think that fseek is faster. But how does the function find the i-th value in the file without reading the intermediary information?

I think that this is implementation-dependent. So I tried to find the implementation of the fseek function, without much success.

But how does the function find the i-th value in the file without reading the intermediary information?

It doesn't. It's up to you to provide the correct (absolute or relative) offset. You can request, for example, to advance the file pointer by i*sizeof(X).
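As a rough illustration of supplying the offset yourself, here is a minimal sketch for the file of int values from the question. The helper name read_nth_int and the choice of a 0-based index are assumptions made for this example, not part of the original answer.

```c
#include <stdio.h>

/* Hypothetical helper: read the i-th int (0-based) from a binary stream.
   Returns 0 on success, -1 on failure. */
int read_nth_int(FILE *fp, long i, int *out)
{
    /* Byte offset of element i, measured from the start of the file. */
    long offset = i * (long)sizeof(int);

    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;   /* the seek request could not be satisfied */
    if (fread(out, sizeof *out, 1, fp) != 1)
        return -1;   /* short read: i is past the end, or an I/O error */
    return 0;
}
```

The caller is expected to have opened the stream in binary mode (for example "rb"), since arbitrary byte offsets are only specified for binary streams.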

It still needs to follow the chain of sectors in which the file is located to find the right one, but that doesn't require reading those sectors. That metadata is stored outside of the file itself.

Is there any real advantage of using fseek for positioning the file pointer to the i-th int value and reading it after the fseek instead of using a sequence of i fread calls?

There are potential benefits at every level.

By seeking, the system may have to read less from the disk. The system reads from the disk in sectors, so short seeks might not have this benefit. But seeking over entire sectors reduces the amount of data that needs to be fetched from the disk.

Similarly, by seeking, the stdio library may have to request less from the OS. The stdio library normally reads more than it requires so that future calls to fread don't need to touch the OS or the disk. A short seek might not require making any system calls, but seeking beyond the end of the buffered data could reduce the total amount of data fetched from the OS.

Finally, the skipped data doesn't need to be copied from the stdio library's buffers to the user's buffer at all when using fseek, no matter how far you seek.

Oh, and let's not forget that you were considering i-1 separate reads to skip the data rather than a single large one. Each of those reads consumes CPU time, both in the library (error checking) and in the caller (error handling).
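To make the per-call overhead concrete, here is a minimal sketch of the read-and-discard alternative. The helper name read_nth_int_sequential and the calls counter are assumptions made for this example. With a 0-based index i it issues i discarded reads plus the one that matters, each with its own error check, whereas a seek-based version needs one fseek and one fread.

```c
#include <stdio.h>

/* Hypothetical helper: reach the i-th int (0-based) by reading and
   discarding every value before it. The number of fread calls made is
   stored in *calls. Returns 0 on success, -1 on failure. */
int read_nth_int_sequential(FILE *fp, long i, int *out, long *calls)
{
    int discard;

    *calls = 0;
    rewind(fp);                 /* start from the beginning of the file */
    for (long k = 0; k < i; k++) {
        (*calls)++;
        if (fread(&discard, sizeof discard, 1, fp) != 1)
            return -1;          /* ran out of data or hit an I/O error */
    }
    (*calls)++;
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

Both approaches return the same value; the difference is only in how much work is done to get there.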

Is there any real advantage of using fseek for positioning the file pointer to the i-th int value and reading it after the fseek instead of using a sequence of i fread calls?

Yes. If you want to read a value from the file and you know where it is, there is no reason to read anything else.

Intuitively, I think that fseek is faster. But how does the function find the i-th value in the file without reading the intermediary information?

Your intuition is correct: reading one value is generally more efficient than reading several. The way fseek finds the value is simple. Generally speaking, each position in the file corresponds to one byte, so if you pass an offset of, for example, 7 with SEEK_SET, the next read starts at byte offset 7, which is the 8th byte of the file. Imagine your file has the following data:

 -58 10 12  14 7 9
^      ^
|      |
0      offset of 7

fseek(fp, 7, SEEK_SET);

if (fscanf(fp, "%d", &num) == 1) {
    printf("%d", num);
}

This will output 12.

The file position indicator was set to offset 7, so reading begins at that byte (the 8th byte of the file); fscanf with %d then skips any leading whitespace and parses 12. It's as if you had an array and wanted the element at index 7: you would just use arr[7].
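The fragment above can be turned into something testable along these lines. This is only a sketch: the helper name int_at_offset is hypothetical, and it uses tmpfile (which opens a binary update stream) so that seeking to an arbitrary byte offset is well defined.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` to a temporary file, seek to `offset`
   bytes from the start, and parse the next integer with fscanf.
   Returns 0 on success, -1 on failure. */
int int_at_offset(const char *text, long offset, int *out)
{
    FILE *fp = tmpfile();       /* binary update stream, removed on close */
    int ok = -1;

    if (fp == NULL)
        return -1;
    if (fputs(text, fp) != EOF &&
        fseek(fp, offset, SEEK_SET) == 0 &&
        fscanf(fp, "%d", out) == 1)
        ok = 0;
    fclose(fp);
    return ok;
}
```

Calling int_at_offset(" -58 10 12  14 7 9", 7, &num) stores 12 in num. Whether or not the leading space is really part of the file, offset 7 lands on or just before the digits of 12, so the result is the same.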

I think that this is implementation-dependent.

Though some small details can be implementation-defined, the overall behavior is standardised.

§7.21.9.2 The fseek function

Synopsis

  1.
     #include <stdio.h>
     int fseek(FILE *stream, long int offset, int whence);

Description:

  1. The fseek function sets the file position indicator for the stream pointed to by stream. If a read or write error occurs, the error indicator for the stream is set and fseek fails.

  2. For a binary stream, the new position, measured in characters from the beginning of the file, is obtained by adding offset to the position specified by whence. The specified position is the beginning of the file if whence is SEEK_SET, the current value of the file position indicator if SEEK_CUR, or end-of-file if SEEK_END. A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.

  3. For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.

  4. After determining the new position, a successful call to the fseek function undoes any effects of the ungetc function on the stream, clears the end-of-file indicator for the stream, and then establishes the new position. After a successful fseek call, the next operation on an update stream may be either input or output.

Returns:

  1. The fseek function returns nonzero only for a request that cannot be satisfied.
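As a small illustration of the quoted rules, the sketch below uses SEEK_CUR for a relative seek on a binary stream, checks the return value, and reports the resulting position with ftell. The helper name skip_ints is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: move `count` ints forward from the current position
   of a binary stream, then report the new byte position.
   Returns -1L if the seek (or ftell) fails. */
long skip_ints(FILE *fp, long count)
{
    if (fseek(fp, count * (long)sizeof(int), SEEK_CUR) != 0)
        return -1L;     /* nonzero: the request could not be satisfied */
    return ftell(fp);   /* ftell itself returns -1L on failure */
}
```

Because a text stream only accepts an offset of zero or a value previously returned by ftell, this kind of arithmetic on offsets is meant for streams opened in binary mode.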

