StandardOpenOption.SYNC vs StandardOpenOption.DSYNC
Gili
DSYNC is a subset of SYNC. SYNC requires that all data (file data and the file metadata managed by the filesystem) be written out synchronously, while DSYNC requires that only the file data be written out synchronously.
As for the overhead, I think that is a giant "it depends on the filesystem". Looking at modern filesystems that use concepts like copy-on-write, shadow-copying, versioning, checksumming, etc., I imagine it could get expensive to block the entire write operation until all that work is done.
Potential for data loss is a more confusing answer to provide; the advantage of asynchronous file I/O is that the underlying filesystem or disk can batch or reorder writes to avoid random I/O and structure the writes in a more sequential manner. That is great, but to answer your question about data loss: any pending writes sitting in the cache before a flush could potentially be lost. In short, it is hard to say.
In short, the ordering looks like:
I should say I am assuming all these questions pertain to the new AsynchronousFileChannel in Java 7; my apologies if that isn't the case.
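To make the difference concrete, here is a minimal sketch of how the two options are passed when opening a channel. It uses FileChannel for brevity; AsynchronousFileChannel.open accepts the same open options. The class name, method name, and file names are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SyncOptionsDemo {

    // Writes text to path through a channel opened with the given extra option
    // (intended to be SYNC or DSYNC) and returns the resulting file size.
    static long writeWith(Path path, String text, StandardOpenOption extra) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING,
                extra)) {
            ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            // write() may write fewer bytes than requested, so loop until drained.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
        return Files.size(path);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-demo"); // hypothetical location
        // DSYNC: each write returns once the file content has reached storage.
        long a = writeWith(dir.resolve("data-only.log"), "hello\n", StandardOpenOption.DSYNC);
        // SYNC: each write returns once content and metadata have reached storage.
        long b = writeWith(dir.resolve("data-and-meta.log"), "hello\n", StandardOpenOption.SYNC);
        System.out.println(a + " bytes, " + b + " bytes");
    }
}
```

From the caller's point of view the code is identical apart from the option; only the durability guarantee (and possibly the cost per write) differs.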
According to the source for sun.nio.fs.UnixChannelFactory, those options map respectively to the O_SYNC and O_DSYNC options of open(), whose documentation says:
O_SYNC -- like write(2) followed by a call to fsync(2)
O_DSYNC -- like write(2) followed by a call to fdatasync(2)
The documentation for fdatasync(2) then makes it explicit that anything not needed to retrieve the file's data, such as last access time and last modification time, isn't flushed by O_DSYNC, but anything that is needed, is:
fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) so that all changed information can be retrieved even after the system crashed or was rebooted. ... The call blocks until the device reports that the transfer has completed. It also flushes metadata information associated with the file (see stat(2)).
fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see stat(2)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.
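Java exposes a comparable distinction for explicit flushes through FileChannel.force(boolean metaData): passing false asks for the file content to be forced to storage, passing true asks for content and metadata. Whether a given JDK implements these with fdatasync and fsync is an implementation detail and an assumption here, not something the Javadoc promises. A minimal sketch, with a made-up class and method name:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ForceDemo {

    // Appends data to path, then forces it to storage.
    // includeMetadata = false requests content only; true requests content and metadata.
    // Returns the channel's size after the append.
    static long appendAndForce(Path path, byte[] data, boolean includeMetadata) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Blocks until the requested updates have been handed to the storage device.
            ch.force(includeMetadata);
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("force-demo", ".bin"); // hypothetical file
        System.out.println(appendAndForce(file, new byte[] {1, 2, 3}, false));
        System.out.println(appendAndForce(file, new byte[] {4, 5, 6}, true));
    }
}
```

Unlike SYNC and DSYNC, which apply to every write on the channel, force() lets the program batch several writes and pay the flush cost once.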
So an educated guess is that, unless a program depends on these attributes that don't matter for data retrieval (and on their values being synchronized), StandardOpenOption.DSYNC is acceptable. (Although I'm not sure how much performance benefit there is in practice in choosing DSYNC over SYNC.)
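Since the performance difference is uncertain, one way to get a feel for it on a particular machine is simply to measure. The following is a rough timing sketch, not a rigorous benchmark: the record count, record size, and names are arbitrary choices for illustration, and results will depend heavily on the filesystem and device (a temp directory backed by memory, for example, may show no difference at all).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SyncTimingSketch {

    // Writes `count` records of `recordSize` bytes through a channel opened with
    // the given option (SYNC or DSYNC) and returns the elapsed time in nanoseconds.
    static long timeWrites(Path path, StandardOpenOption syncOption, int count, int recordSize)
            throws IOException {
        byte[] record = new byte[recordSize];
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING,
                syncOption)) {
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.wrap(record);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-timing"); // hypothetical location
        int count = 200;
        int recordSize = 512;
        StandardOpenOption[] options = {StandardOpenOption.DSYNC, StandardOpenOption.SYNC};
        for (StandardOpenOption option : options) {
            Path file = dir.resolve(option.name().toLowerCase() + ".bin");
            long nanos = timeWrites(file, option, count, recordSize);
            System.out.printf("%s: %d writes in %.1f ms%n", option, count, nanos / 1e6);
        }
    }
}
```

For numbers worth relying on, the measurement would need warm-up runs, repetition, and a target directory on the filesystem the application actually uses.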
Looking through BasicFileAttributes, fields such as creationTime(), lastModifiedTime(), and lastAccessTime() probably fall into this "not important for data access" category, while fields such as isDirectory(), isRegularFile(), is*() and size() probably don't, as I can't imagine the data being accessible if they are wrong.
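For reference, these attributes can be read with Files.readAttributes. The sketch below groups them the way the guess above does; the grouping is this answer's assumption, not something BasicFileAttributes or the open(2) documentation specifies, and the class and method names are made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class AttributesDemo {

    // Attributes guessed to matter for reading the data back correctly.
    static String dataCritical(BasicFileAttributes a) {
        return "size=" + a.size()
                + " regular=" + a.isRegularFile()
                + " directory=" + a.isDirectory();
    }

    // Attributes guessed NOT to matter for reading the data back.
    static String timestamps(BasicFileAttributes a) {
        return "created=" + a.creationTime()
                + " modified=" + a.lastModifiedTime()
                + " accessed=" + a.lastAccessTime();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("attrs-demo", ".txt"); // hypothetical file
        Files.write(file, new byte[] {'d', 'a', 't', 'a'});
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        System.out.println(dataCritical(attrs));
        System.out.println(timestamps(attrs));
    }
}
```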