
h5py: Do I need to flush() before I close() a file?

The title contains the question: in the Python HDF5 library h5py, do I need to flush() a file before I close() it?

Or does closing the file already make sure that any data that might still be in the buffers will be written to disk?

What exactly is the point of flushing? When would flushing be necessary?

No, you do not need to flush the file before closing. Flushing is done automatically by the underlying HDF5 C library when you close the file.
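For illustration, a minimal sketch of the usual pattern (the file name and dataset name below are just placeholders): everything written inside the with block ends up on disk once the file is closed, whether or not flush() is called explicitly.

    import h5py
    import numpy as np

    # Closing the file flushes the HDF5 library buffers for us.
    with h5py.File("example.h5", "w") as f:            # hypothetical file name
        f.create_dataset("values", data=np.arange(100))
        # f.flush()  # optional: not required before close
    # Leaving the "with" block closes the file, which flushes any remaining buffers.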


As to the point of flushing: file I/O is slow compared to things like memory or cache access. If programs had to wait until data was actually on the disk each time a write was performed, that would slow things down a lot. So the actual writing to disk is buffered by at least the OS, and in many cases also by the I/O library being used (e.g., the C standard I/O library). When you ask to write data to a file, it usually just means that the OS has copied your data to its own internal buffer, and will actually put it on the disk when it's convenient to do so.

Flushing overrides this buffering, at whatever level the call is made. So calling h5py.File.flush() will flush the HDF5 library buffers, but not necessarily the OS buffers. The point of this is to give the program some control over when data actually leaves a buffer.
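One situation where an explicit flush() can be useful is a long-running job that appends to a resizable dataset: flushing after each batch pushes the HDF5 library buffers out to the OS, so the data written so far is not lost if the process dies before close(). A minimal sketch, with hypothetical file and dataset names:

    import h5py
    import numpy as np

    f = h5py.File("log.h5", "w")                        # hypothetical file name
    dset = f.create_dataset("samples", shape=(0,), maxshape=(None,), dtype="f8")

    for step in range(10):
        batch = np.random.rand(1000)
        dset.resize(dset.shape[0] + batch.size, axis=0)
        dset[-batch.size:] = batch
        f.flush()    # push the HDF5 library buffers to the OS after each batch

    f.close()        # close() would have flushed anyway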

For example, writing to the standard output is usually line-buffered. But if you really want to see the output before a newline, you can call fflush(stdout). This might make sense if you are piping the standard output of one process into another: the downstream process can start consuming the input right away, without waiting for the OS to decide it's a good time.
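In Python the equivalent is sys.stdout.flush() or the flush=True argument of print(). A small sketch of a progress indicator whose dots should appear immediately, even when stdout is piped into another process:

    import time

    for i in range(5):
        # Without flush=True the dots may sit in the buffer until a newline
        # is printed (or until the buffer fills up when stdout is piped).
        print(".", end="", flush=True)
        time.sleep(1)
    print()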

Another good example is making a call to fork(2). This usually copies the entire address space of a process, which means the I/O buffers as well. That may result in duplicated output, unnecessary copying, etc. Flushing a stream guarantees that the buffer is empty before forking.
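A small, Unix-only sketch of that duplicated-output effect in Python. It assumes stdout is redirected to a file or pipe, so it is block-buffered rather than line-buffered:

    import os
    import sys

    # Assumes stdout is redirected (block-buffered), e.g. python script.py > out.txt
    sys.stdout.write("written before fork\n")

    # sys.stdout.flush()   # uncomment to avoid the duplicated line below

    pid = os.fork()
    # Parent and child each inherit a copy of the unflushed buffer, so
    # "written before fork" appears twice in the output when both exit.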

