
PHP writing large amounts of files to one directory

I'm using PHP to make a simple caching system, but I'm going to be caching up to 10,000 files in one run of the script. At the moment I'm using a simple loop with

$file = "../cache/".$id.".htm";
$handle = fopen($file, 'w');
fwrite($handle, $temp);
fclose($handle);

($id being a random string which is assigned to a row in a database)

but it seems a little bit slow, is there a better method of doing that? Also I read somewhere that on some operating systems you can't store thousands and thousands of files in one single directory, is this relevant to CentOS or Debian? Bear in mind this folder may well end up having over a million small files in it.

Simple questions, I suppose, but I don't want to scale this code up and then find out I'm doing it wrong; I'm only testing with caching 10-30 pages at a time at the moment.

Remember that in UNIX, everything is a file.

When you put that many files into a directory, something has to keep track of those files. If you run:

ls -la

You'll probably notice that the '.' entry has grown to some size. This is where all the info on your 10,000 files is stored.

Every seek, and every write into that directory will involve parsing that large directory entry.

You should implement some kind of directory hashing system. This'll involve creating subdirectories under your target dir.

eg.

/somedir/a/b/c/yourfile.txt
/somedir/d/e/f/yourfile.txt

This'll keep the size of each directory entry quite small, and speed up IO operations.
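A minimal sketch of that layout, assuming the subdirectory characters come from a hash of the cache id (the function and parameter names here are illustrative, not from the answer):

function nestedCachePath($id, $levels = 3, $baseDir = '../cache')
{
    // Build a nested path such as ../cache/a/b/1/<id>.htm, taking one
    // character of the id's md5 hash per directory level.
    $hash = md5($id);
    $dir  = $baseDir;
    for ($i = 0; $i < $levels; $i++) {
        $dir .= '/' . $hash[$i];   // one small subdirectory per level
    }
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);   // create the nested directories on first use
    }
    return $dir . '/' . $id . '.htm';
}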

The number of files you can effectively use in one directory depends not on the operating system but on the filesystem.

You can split your cache dir effectively by taking the md5 hash of the filename and using its first 1, 2 or 3 characters as a subdirectory. Of course you have to create the directory if it doesn't exist, and use the same approach when retrieving files from the cache.

For a few tens of thousands, 2 characters (256 subdirs from 00 to ff) would be enough.
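For concreteness, a hedged sketch of that two-character split (the helper name is an assumption; file_put_contents and file_get_contents are standard PHP):

function cacheFilePath($id, $baseDir = '../cache')
{
    $dir = $baseDir . '/' . substr(md5($id), 0, 2);   // e.g. ../cache/3f
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);                      // create the subdir on first use
    }
    return $dir . '/' . $id . '.htm';
}

// Writing and retrieving both go through the same path computation.
file_put_contents(cacheFilePath($id), $temp);
$cached = file_get_contents(cacheFilePath($id));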

File I/O in general is relatively slow. If you are looping over thousands of files and writing them to disk, the slowness could be normal.

I would move that over to a nightly job if that's a viable option.

You may want to look at memcached as an alternative to filesystems. Using memory will give a huge performance boost.

http://php.net/memcache/
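As a rough sketch of what that would look like with the Memcache extension (the host, port, key prefix, and expiry below are assumptions, not from the answer):

$memcache = new Memcache();
$memcache->connect('localhost', 11211);          // assumed local memcached server

// Store the rendered page in memory for an hour instead of writing a file.
$memcache->set('page_' . $id, $temp, 0, 3600);

// Later, fetch it back; get() returns false on a cache miss.
$html = $memcache->get('page_' . $id);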
