简体   繁体   中英

Why MySQL still uses fsync() to flush the data when the option is O_DIRECT?

http://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_method

Based on the article description above, if we choose the option O_DIRECT, it describes as below:

O_DIRECT: InnoDB uses O_DIRECT (or directio() on Solaris) to open the data files, and uses fsync() to flush both the data and log files.

As option O_DIRECT means no\minimizing data would be cached in the OS page cache, however fsync() is used to flush the data from the page cache to device, so my question is why MySQL still use fsync() to flush both the data when the option is O_DIRECT?

Actually, the explanation is added in the documentation you linked in the paragraph following O_DIRECT option's description (highlighting is mine):

O_DIRECT_NO_FSYNC: InnoDB uses O_DIRECT during flushing I/O, but skips the fsync() system call afterward. This setting is suitable for some types of file systems but not others. For example, it is not suitable for XFS. If you are not sure whether the file system you use requires an fsync(), for example to preserve all file metadata, use O_DIRECT instead. This option was introduced in MySQL 5.6.7 (Bug #11754304, Bug #45892).

MySQL bug #45892 contains additional information:

Some testing by Domas has shown that some filesystems (XFS) do not sync metadata without the fsync. If the metadata would change, then you need to still use fsync (or O_SYNC for file open).

For example, if a file grows while O_DIRECT is enabled it will still write to the new part of the file, however since the metadata doesn't reflect the new size of the file the tail portion can be lost in the event of a crash.

Solution:

Continue to use fsync when important metadata changes or use O_SYNC in addition to O_DIRECT.

To sum it up: not using fsync() with certain file systems would cause MySQL to fail. However, MySQL offers the option from v5.6.7 to configure MySQL (well, innodb) tailored to your own OS' capabilities in this aspect by adding O_DIRECT_NO_FSYNC option.

O_DIRECT skips OS cache but it does not ensure that data is persisted on disk. O_DIRECT writes only to drive write cache. Once drive write cache is disabled the rate falls down to fsync level. O_DIRECT could be a good option if drive write is crash safe (backed by a battery).

Check this blog for a very thorough analysis

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM