简体   繁体   中英

How do I create a sparse file programmatically, in C, on Mac OS X?

I'd like to create a sparse file such that all-zero blocks don't take up actual disk space until I write data to them. Is it possible?

There seems to be some confusion as to whether the default Mac OS X filesystem (HFS+) supports holes in files. The following program demonstrates that this is not the case.

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

void create_file_with_hole(void)
{
    int fd = open("file.hole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
    write(fd, "Hello", 5);
    lseek(fd, 99988, SEEK_CUR); // Make a hole
    write(fd, "Goodbye", 7);
    close(fd);
}

void create_file_without_hole(void)
{
    int fd = open("file.nohole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
    write(fd, "Hello", 5);
    char buf[99988];
    memset(buf, 'a', 99988);
    write(fd, buf, 99988); // Write lots of bytes
    write(fd, "Goodbye", 7);
    close(fd);
}

int main()
{
    create_file_with_hole();
    create_file_without_hole();
    return 0;
}

The program creates two files, each 100,000 bytes in length, one of which has a hole of 99,988 bytes.

On Mac OS X 10.5 on an HFS+ partition, both files take up the same number of disk blocks (200):

$ ls -ls
total 400
200 -rw-------  1 user  staff  100000 Oct 10 13:48 file.hole
200 -rw-------  1 user  staff  100000 Oct 10 13:48 file.nohole

Whereas on CentOS 5, the file without holes consumes 88 more disk blocks than the other:

$ ls -ls
total 136
 24 -rw-------  1 user   nobody 100000 Oct 10 13:46 file.hole
112 -rw-------  1 user   nobody 100000 Oct 10 13:46 file.nohole

As in other Unixes, it's a feature of the filesystem. Either the filesystem supports it for ALL files or it doesn't. Unlike Win32, you don't have to do anything special to make it happen. Also unlike Win32, there is no performance penalty for using a sparse file.

On MacOS, the default filesystem is HFS+ which does not support sparse files.

Update: MacOS used to support UFS volumes with sparse file support, but that has been removed. None of the currently supported filesystems feature sparse file support.

This thread becomes a comprehensive source of info about the sparse files. Here is the missing part for Win32:

Decent article with examples

Tool that estimates if it makes sense to make file as sparse

Regards

hdiutil can handle sparse images and files but unfortunately the framework it links against is private.

You could try defining external symbols as defined by the DiskImages framework below but this is most likely not acceptable for production code, plus since the framework is private you'd have to reverse engineer its use cases.

cristi:~ diciu$ otool -L /usr/bin/hdiutil

/usr/bin/hdiutil: /System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages (compatibility version 1.0.8, current version 194.0.0) [..]

cristi:~ diciu$ nm /System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages | awk -F' ' '{print $3}' | c++filt | grep -i sparse

[..]

CSparseFile::sector2Band(long long)

CSparseFile::addIndexNode()

CSparseFile::readIndexNode(long long, SparseFileIndexNode*)

CSparseFile::readHeaderNode(CBackingStore*, SparseFileHeaderNode*, unsigned long)

[... cut for brevity]

You could use hdiutil as an external process and have it create an sparse disk image for you. From the C process you would then create a file in the (mounted) sparse disk image.

If you want portability, the last resort is to write your own access function so that you manage an index and a set of blocks.

In essence you manage a single file as the OS manages the disk keeping the chain of the blocks that are part of the file, the bitmap of allocated/free blocks etc.

Of course this will lead to a non optimized and slower access, I would reccomend this apprach only if the requirement to save space is absolutely critical and you have enough time to write a robust set of access functions.

And even in that case, I would first investigate if your problem is in need of a different solution. Probably you should store your data differently?

If you seek (fseek, ftruncate, ...) to past the end, the file size will be increased without allocating blocks until you write to the holes. But there's no way to create a magic file that automatically converts blocks of zeroes to holes. You have to do it yourself.

This may be helpful to look at (the OpenBSD cp command inserts holes instead of writing zeroes). patch

It looks like OS X supports sparse files on UDF volumes. I tried titaniumdecoy's test program on OS X 10.9 and it did generate a sparse file on a UDF disk image. Also, not that UFS is no longer supported in OS X, so if you need sparse files, UDF is the only natively supported file system that supports them.

I also tried the program on SMB shares. When the server is Ubuntu (ext4 filesystem) the program creates a sparse file, but 'ls -ls' through SMB doesn't show that. If you do 'ls -ls' on the Ubuntu host itself it does show the file is sparse. When the server is Windows XP (NTFS filesystem) the program does not generate a sparse file.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM