
What is the best practice to grow a very large binary file rapidly?

My Java application deals with large binary data files using memory-mapped files (MappedByteBuffer, FileChannel, and RandomAccessFile). It often needs to grow the binary file - my current approach is to re-map the file with a larger region.

It works; however, there are two problems:

  1. Growing takes more and more time as the file becomes larger.
  2. If growing happens very rapidly (e.g. in a while(true) loop), the JVM hangs forever after the re-map operation has been performed roughly 30,000+ times.

What are the alternative approaches, and what is the best way to do this?

Also, I cannot figure out why the second problem occurs. Please share your thoughts on that problem as well.

Thank you!

Current code for growing a file, if it helps:

;; Re-map the channel from offset 0, extending the mapped region
;; by DOC-HDR + room bytes beyond the current limit.
(set! data (.map ^FileChannel data-fc FileChannel$MapMode/READ_WRITE
                 0 (+ (.limit ^MappedByteBuffer data) (+ DOC-HDR room))))

You probably want to grow your file in larger chunks. Double the mapped size each time you remap, like a dynamic array, so that the cost of growing is amortized constant.
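A minimal Java sketch of that scheme, assuming a single mapping under Integer.MAX_VALUE bytes (the class and method names here are hypothetical):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Hypothetical sketch: grow by doubling, so the amortized cost of
    // each remap is constant instead of growing by a small fixed amount.
    class GrowableMappedFile {
        private final FileChannel channel;
        private MappedByteBuffer buffer;

        GrowableMappedFile(String path, int initialSize) throws IOException {
            this.channel = new RandomAccessFile(path, "rw").getChannel();
            this.buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, initialSize);
        }

        // Remap only when needed, at least doubling the mapped region each time.
        void ensureCapacity(int needed) throws IOException {
            if (needed <= buffer.capacity()) {
                return;
            }
            int newSize = buffer.capacity();
            while (newSize < needed) {
                newSize *= 2; // doubling keeps the number of remaps logarithmic
            }
            int position = buffer.position();
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, newSize);
            buffer.position(position); // restore the write cursor
        }
    }

With this, growing from 1 MB to 1 GB takes only about 10 remaps rather than thousands.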

I don't know why the remap hangs after 30,000 times; that seems odd. But you should be able to get away with far fewer than 30,000 remaps if you use the scheme I suggest.

The JVM doesn't clean up memory mappings even if you call the cleaner explicitly. Thank you @EJP for the correction.

If you create 32,000 of these, they could all be in existence at once. BTW: I suspect you might be hitting some 15-bit limit (32,768 = 2^15).

The only solution for this is: don't create so many mappings. You can map an entire 4 TB disk with fewer than 4K mappings (at 1 GB per mapping, 4 TB needs only 4,096 of them).

I wouldn't create a mapping smaller than 16 to 128 MB if you know the usage will grow, and I would consider up to 1 GB per mapping. The reason you can do this at little cost is that main memory and disk space are not allocated until you actually use the pages, i.e. the main memory usage grows 4 KB at a time.

The only reason I wouldn't create a 2 GB mapping is that Java doesn't support it due to the Integer.MAX_VALUE size limit :( If you have 2 GB or more you have to create multiple mappings.
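A sketch of that multiple-mapping approach, assuming 1 GB chunks (the class and method names are hypothetical):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch: cover a file of any size with a list of 1 GB
    // mappings, since a single map() call is capped at Integer.MAX_VALUE bytes.
    class ChunkedMapper {
        static List<MappedByteBuffer> mapInChunks(RandomAccessFile file) throws IOException {
            final long CHUNK_SIZE = 1L << 30; // 1 GB per mapping
            FileChannel channel = file.getChannel();
            List<MappedByteBuffer> chunks = new ArrayList<>();
            for (long offset = 0; offset < channel.size(); offset += CHUNK_SIZE) {
                long size = Math.min(CHUNK_SIZE, channel.size() - offset);
                chunks.add(channel.map(FileChannel.MapMode.READ_WRITE, offset, size));
            }
            return chunks;
        }
    }

Because pages are allocated lazily, the mappings themselves cost almost nothing until they are actually written to.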

Unless you can afford exponential growth of the file, such as doubling or any other constant multiplier, you need to consider whether you really need a MappedByteBuffer at all, given its limitations (unable to grow the file, no GC, etc.). I personally would either review the problem or else use a RandomAccessFile in "rw" mode, probably with a virtual-array layer over the top of it.
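A rough sketch of that "virtual array" idea, assuming fixed-width long records (the class name is hypothetical, and a real version would buffer its reads and writes):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Hypothetical "virtual array" over a RandomAccessFile in "rw" mode.
    // Unlike a mapping, the file grows automatically when you write past
    // the current end, so there is no remap step at all.
    class LongFileArray {
        private final RandomAccessFile file;

        LongFileArray(String path) throws IOException {
            this.file = new RandomAccessFile(path, "rw");
        }

        long get(long index) throws IOException {
            file.seek(index * Long.BYTES);
            return file.readLong();
        }

        void put(long index, long value) throws IOException {
            file.seek(index * Long.BYTES); // writing past EOF extends the file
            file.writeLong(value);
        }
    }

The trade-off is per-access syscall overhead instead of page-fault-backed memory access, which may or may not matter for your workload.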
