Buffered RandomAccessFile java

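RandomAccessFile is quite slow for random access to a file. You often read about implementing a buffered layer over it, but code doing this isn't possible to find online.

So my question is: would those of you who know of an open-source implementation of this class share a pointer to it, or share your own implementation?

It would be nice if this question turned into a collection of useful links and code for this problem, which I'm sure many people share and which SUN never addressed properly.

Please, no references to MemoryMapping, as the files can be much bigger than Integer.MAX_VALUE.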

You can make a BufferedInputStream from a RandomAccessFile with code like this:

 RandomAccessFile raf = ...
 FileInputStream fis = new FileInputStream(raf.getFD());
 BufferedInputStream bis = new BufferedInputStream(fis);

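Some things to note

  1. Closing the FileInputStream will close the RandomAccessFile, and vice versa.
  2. The RandomAccessFile and the FileInputStream share the same file position, so reading from the FileInputStream advances the file pointer of the RandomAccessFile, and vice versa.

Probably the way you want to use this is something like: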

RandomAccessFile raf = ...
FileInputStream fis = new FileInputStream(raf.getFD());
BufferedInputStream bis = new BufferedInputStream(fis);

//do some reads with buffer
bis.read(...);
bis.read(...);

//seek to a different section of the file, so discard the previous buffer
raf.seek(...);
bis = new BufferedInputStream(fis);
bis.read(...);
bis.read(...);
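
The seek-then-rebuffer pattern above can be wrapped in a small helper so that callers cannot forget to discard the stale buffer. This is only a sketch: the class name SeekableBufferedInput and its methods are hypothetical, it is read-only, and it assumes nothing else moves the RandomAccessFile's file pointer while it is in use.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper (not from any library): buffered reads on top of a
// RandomAccessFile, with seek support.
final class SeekableBufferedInput implements AutoCloseable {
    private final RandomAccessFile raf;
    private final FileInputStream fis;   // shares the file descriptor and file pointer with raf
    private BufferedInputStream bis;

    SeekableBufferedInput(RandomAccessFile raf) throws IOException {
        this.raf = raf;
        this.fis = new FileInputStream(raf.getFD());
        this.bis = new BufferedInputStream(fis);
    }

    // Moves to an absolute position and drops whatever was buffered before.
    void seek(long pos) throws IOException {
        raf.seek(pos);
        bis = new BufferedInputStream(fis);
    }

    // Returns the next byte as 0..255, or -1 at end of file.
    int read() throws IOException {
        return bis.read();
    }

    // Reads up to len bytes into dst; returns the count, or -1 at end of file.
    int read(byte[] dst, int off, int len) throws IOException {
        return bis.read(dst, off, len);
    }

    @Override
    public void close() throws IOException {
        // Closing either object closes the shared descriptor; closing both is harmless.
        try {
            bis.close();
        } finally {
            raf.close();
        }
    }
}
```

Every seek throws away the buffered bytes, so this only pays off when each seek is followed by a run of sequential reads.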

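Well, I do not see a reason not to use java.nio.MappedByteBuffer even if the files are bigger than Integer.MAX_VALUE.

Evidently you will not be allowed to define a single MappedByteBuffer for the whole file. But you could have several MappedByteBuffers accessing different regions of the file.

The position and size parameters of FileChannel.map are of type long, which means the position can exceed Integer.MAX_VALUE; the only thing you have to take care of is that the size of each buffer is no bigger than Integer.MAX_VALUE.

Therefore, you could define several maps like this: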

buffer[0] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0, Integer.MAX_VALUE);
buffer[1] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 2147483647L, Integer.MAX_VALUE);
buffer[2] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 4294967294L, Integer.MAX_VALUE);
...

In summary, the size cannot be bigger than Integer.MAX_VALUE, but the start position can be anywhere in your file.
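
To make the idea of several mappings concrete, here is a minimal read-only sketch that splits a file into fixed-size mapped chunks and resolves a long position to the right chunk. The class name ChunkedMappedFile and the choice of chunk size are assumptions for illustration; only FileChannel.map and MappedByteBuffer.get are real API.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical read-only view of a file as a sequence of fixed-size mappings.
final class ChunkedMappedFile {
    private final MappedByteBuffer[] chunks;
    private final long chunkSize;
    private final long fileSize;

    // chunkSize must be in 1..Integer.MAX_VALUE; the channel must be open for reading.
    ChunkedMappedFile(FileChannel channel, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.fileSize = channel.size();
        int count = (int) ((fileSize + chunkSize - 1) / chunkSize);
        chunks = new MappedByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long start = (long) i * chunkSize;
            // The last chunk is shorter when the file size is not a multiple of chunkSize.
            long length = Math.min(chunkSize, fileSize - start);
            chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
        }
    }

    long size() {
        return fileSize;
    }

    // Absolute read of one byte at a position that may exceed Integer.MAX_VALUE.
    byte get(long pos) {
        if (pos < 0 || pos >= fileSize) {
            throw new IndexOutOfBoundsException("pos=" + pos);
        }
        return chunks[(int) (pos / chunkSize)].get((int) (pos % chunkSize));
    }
}
```

In real use the chunk size would be large (up to Integer.MAX_VALUE); a small value is convenient for testing the index arithmetic.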

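In the book Java NIO, the author Ron Hitchens states:

"Accessing a file through the memory-mapping mechanism can be far more efficient than reading or writing data by conventional means, even when using channels. No explicit system calls need to be made, which can be time-consuming. More importantly, the virtual memory system of the operating system automatically caches memory pages. These pages will be cached using system memory and will not consume space from the JVM's memory heap.

Once a memory page has been made valid (brought in from disk), it can be accessed again at full hardware speed without the need to make another system call to get the data. Large, structured files that contain indexes or other sections that are referenced or updated frequently can benefit tremendously from memory mapping. When combined with file locking to protect critical sections and control transactional atomicity, you begin to see how memory mapped buffers can be put to good use."

I really doubt that you will find a third-party API doing something better than that. Perhaps you may find an API written on top of this architecture to simplify the work.

Don't you think that this approach ought to work for you?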

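"RandomAccessFile is quite slow for random access to a file. You often read about implementing a buffered layer over it, but code doing this isn't possible to find online."

Well, it is possible to find online.
For one, the JAI source code in jpeg2000 has an implementation, and there is an even more unencumbered implementation at: http://www.unidata.ucar.edu/software/netcdf-java/

Documentation: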

http://www.unidata.ucar.edu/software/thredds/v4.3/netcdf-java/v4.0/javadoc/ucar/unidata/io/RandomAccessFile.html

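If you're running on a 64-bit machine, then memory-mapped files are your best approach. Simply map the entire file into an array of equal-sized buffers, then pick a buffer for each record as needed (i.e., edalorzo's answer, except that you want overlapping buffers so that no record spans a buffer boundary).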
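
One way to read the overlapping-buffer suggestion is sketched below: mappings start every stride bytes, but each one extends a further maxRecordSize bytes, so a record that begins anywhere inside a stride is fully contained in that stride's mapping. The class and parameter names are hypothetical, and the sketch assumes a known upper bound on record size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical sketch of overlapping read-only mappings.
final class OverlappingMappings {
    private final MappedByteBuffer[] buffers;
    private final int stride;   // distance between the starts of consecutive mappings

    // stride + maxRecordSize must not exceed Integer.MAX_VALUE.
    OverlappingMappings(FileChannel channel, int stride, int maxRecordSize) throws IOException {
        if (stride <= 0 || maxRecordSize < 0
                || (long) stride + maxRecordSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("invalid stride/maxRecordSize");
        }
        this.stride = stride;
        long size = channel.size();
        int count = (int) ((size + stride - 1) / stride);
        buffers = new MappedByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long start = (long) i * stride;
            // Each mapping overlaps the next one by up to maxRecordSize bytes.
            long length = Math.min((long) stride + maxRecordSize, size - start);
            buffers[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
        }
    }

    // Fills dst with the record starting at pos. The record must be no longer
    // than maxRecordSize and must lie entirely inside the file.
    void readRecord(long pos, byte[] dst) {
        ByteBuffer view = buffers[(int) (pos / stride)].duplicate(); // own position, shared content
        view.position((int) (pos % stride));
        view.get(dst);
    }
}
```

The cost of the overlap is a little extra address space per mapping; no data is copied until a record is actually read.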

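If you're running on a 32-bit JVM, then you're stuck with RandomAccessFile. However, you can use it to read a byte[] that contains your entire record, then use a ByteBuffer to retrieve individual values from that array. At worst you should need to make two file accesses: one to retrieve the position/size of the record, and one to retrieve the record itself.

However, be aware that you can start stressing the garbage collector if you create lots of byte[]s, and you'll remain IO-bound if you bounce all over the file.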
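
The read-a-whole-record approach could look roughly like the following. The helper name RecordReader is hypothetical, and the record's position and size are assumed to be known already (for example from an index lookup, which would be the first of the two file accesses).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

// Hypothetical helper: one seek plus one bulk read per record.
final class RecordReader {
    private RecordReader() {
    }

    // Reads size bytes starting at position and returns them wrapped in a
    // ByteBuffer, ready for decoding individual fields.
    static ByteBuffer readRecord(RandomAccessFile raf, long position, int size) throws IOException {
        byte[] record = new byte[size];
        raf.seek(position);
        raf.readFully(record);            // throws EOFException if the file ends early
        return ByteBuffer.wrap(record);   // big-endian by default, like RandomAccessFile.writeInt
    }
}
```

A caller would then pull the fields out with getInt(), getLong(), getDouble() and so on, in the order they were written. Reusing one byte[] per thread instead of allocating a new one per record is a possible way to reduce the garbage-collector pressure mentioned above.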

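The Apache PDFBox project has a nice, tested BufferedRandomAccessFile class.
It is licensed under the Apache License, Version 2.0.

It is an optimized version of the java.io.RandomAccessFile class, as described by Nick Zhang on JavaWorld.com. It is based on the jmzreader implementation and augmented to handle unsigned bytes.

See the source code here: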

  1. https://github.com/apache/pdfbox/.../fontbox/ttf/BufferedRandomAccessFile.java
  2. https://svn.apache.org/repos/asf/pdfbox/.../fontbox/ttf/BufferedRandomAccessFile.java

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

/**
 * Adds caching to a random access file.
 * 
 * Rather than writing directly to disk or to the system, which seems to be
 * what random access file/file channel do, add a small buffer and read/write
 * from it when possible. A single buffer is created, which means reads or
 * writes near each other will be sped up. Reads/writes that are not within
 * the cached block will not be sped up.
 */
public class BufferedRandomAccessFile implements AutoCloseable {

    private static final int DEFAULT_BUFSIZE = 4096;

    /**
     * The wrapped random access file, we will hold a cache around it.
     */
    private final RandomAccessFile raf;

    /**
     * The size of the buffer
     */
    private final int bufsize;

    /**
     * The buffer.
     */
    private final byte[] buf;


    /**
     * Current position in the file.
     */
    private long pos = 0;

    /**
     * When the buffer has been read, this tells us where in the file the buffer
     * starts at.
     */
    private long bufBlockStart = Long.MAX_VALUE;


    // Must be updated on write to the file
    private long actualFileLength = -1;

    // True when the buffer holds changes that have not been written to disk yet.
    private boolean changeMadeToBuffer = false;

    // Must be updated as we write to the buffer.
    private long virtualFileLength = -1;

    public BufferedRandomAccessFile(File name, String mode) throws FileNotFoundException {
        this(name, mode, DEFAULT_BUFSIZE);
    }

    /**
     * 
     * @param file
     * @param mode how to open the random access file.
     * @param b size of the buffer
     * @throws FileNotFoundException
     */
    public BufferedRandomAccessFile(File file, String mode, int b) throws FileNotFoundException {
        this(new RandomAccessFile(file, mode), b);
    }

    public BufferedRandomAccessFile(RandomAccessFile raf) throws FileNotFoundException {
        this(raf, DEFAULT_BUFSIZE);
    }

    public BufferedRandomAccessFile(RandomAccessFile raf, int b) {
        this.raf = raf;
        try {
            this.actualFileLength = raf.length();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        this.virtualFileLength = actualFileLength;
        this.bufsize = b;
        this.buf = new byte[bufsize];
    }

    /**
     * Sets the position of the byte at which the next read/write should occur.
     * 
     * @param pos
     * @throws IOException
     */
    public void seek(long pos) throws IOException{
        this.pos = pos;
    }

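    /**
     * Sets the length of the file, truncating or extending it as needed.
     */
    public void setLength(long fileLength) throws IOException {
        // Flush pending writes first, then drop the buffer so that later
        // reads see the file as it is after truncation or extension.
        writeBufferToDisk();
        this.raf.setLength(fileLength);
        actualFileLength = fileLength;
        virtualFileLength = fileLength;
        bufBlockStart = Long.MAX_VALUE;
    }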

    /**
     * Writes the entire buffer to disk, if needed.
     */
    private void writeBufferToDisk() throws IOException {
        if(!changeMadeToBuffer) return;
        int amountOfBufferToWrite = (int) Math.min((long) bufsize, virtualFileLength - bufBlockStart);
        if(amountOfBufferToWrite > 0) {
            raf.seek(bufBlockStart);
            raf.write(buf, 0, amountOfBufferToWrite);
            this.actualFileLength = virtualFileLength;
        }
        changeMadeToBuffer = false;
    }

    /**
     * Flush the buffer to disk and force a sync.
     */
    public void flush() throws IOException {
        writeBufferToDisk();
        this.raf.getChannel().force(false);
    }

    /**
     * Based on pos, ensures that the buffer is the one that contains pos.
     * 
     * After this call it is safe to write to the buffer to update the byte at pos.
     * If this returns true, reading the byte at pos is valid, because a previous write
     * or setLength has made the file large enough to have a byte at pos.
     * 
     * @return true if the buffer contains data that may be read, i.e. a write or
     * setLength has made the file longer than the current position.
     */
    private boolean readyBuffer() throws IOException {
        boolean isPosOutSideOfBuffer = pos < bufBlockStart || bufBlockStart + bufsize <= pos;

        if (isPosOutSideOfBuffer) {

            writeBufferToDisk();

            // The buffer is always positioned to start at a multiple of a bufsize offset.
            // e.g. for a buf size of 4 the starting positions of buffers can be at 0, 4, 8, 12..
            // Work out where the buffer block should start for the given position. 
            long bufferBlockStart = (pos / bufsize) * bufsize;

            assert bufferBlockStart >= 0;

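            // If the file is large enough, read the block into the buffer.
            // If it is not, there is nothing to read. Either way the buffer
            // is then ready to have writes made to it.
            if(bufferBlockStart < actualFileLength) {
                raf.seek(bufferBlockStart);
                // A single read() may return fewer bytes than requested,
                // so loop until the buffer is full or end of file is hit.
                int filled = 0;
                while (filled < bufsize) {
                    int n = raf.read(buf, filled, bufsize - filled);
                    if (n < 0) break;
                    filled += n;
                }
            }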

            bufBlockStart = bufferBlockStart;
        }

        return pos < virtualFileLength;
    }

    /**
     * Reads a byte from the file, returning an integer of 0-255, or -1 if it has reached the end of the file.
     * 
     * @return
     * @throws IOException 
     */
    public int read() throws IOException {
        if(readyBuffer() == false) {
            return -1;
        }
        try {
            return (buf[(int)(pos - bufBlockStart)]) & 0x000000ff ; 
        } finally {
            pos++;
        }
    }

    /**
     * Write a single byte to the file.
     * 
     * @param b
     * @throws IOException
     */
    public void write(byte b) throws IOException {
        readyBuffer(); // ignore result we don't care.
        buf[(int)(pos - bufBlockStart)] = b;
        changeMadeToBuffer = true;
        pos++;
        if(pos > virtualFileLength) {
            virtualFileLength = pos;
        }
    }

    /**
     * Write all given bytes to the random access file at the current position.
     */
    public void write(byte[] bytes) throws IOException {
        int written = 0;
        int bytesToWrite = bytes.length;

        // First fill the current buffer block as far as it will go.
        readyBuffer();
        int startPositionInBuffer = (int)(pos - bufBlockStart);
        int lengthToWriteToBuffer = Math.min(bytesToWrite - written, bufsize - startPositionInBuffer);
        assert startPositionInBuffer + lengthToWriteToBuffer <= bufsize;

        System.arraycopy(bytes, written,
                        buf, startPositionInBuffer,
                        lengthToWriteToBuffer);
        pos += lengthToWriteToBuffer;
        if(pos > virtualFileLength) {
            virtualFileLength = pos;
        }
        written += lengthToWriteToBuffer;
        this.changeMadeToBuffer = true;

        // Just write the rest straight to the random access file.
        if(written < bytesToWrite) {
            writeBufferToDisk();
            int toWrite = bytesToWrite - written;
            raf.seek(pos); // do not rely on where the flush left the file pointer
            raf.write(bytes, written, toWrite);
            pos += toWrite;
            if(pos > virtualFileLength) {
                virtualFileLength = pos;
            }
            if(pos > actualFileLength) {
                actualFileLength = pos;
            }
        }
    }

    /**
     * Reads up to bytes.length bytes into the given array.
     * 
     * @return the number of bytes read, which is 0 if already at the end of the file.
     */
    public int read(byte[] bytes) throws IOException {
        int read = 0;
        int bytesToRead = bytes.length;
        while(read < bytesToRead) {

            //First see if we need to fill the cache
            if(readyBuffer() == false) {
                //No more to read;
                return read;
            }

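            //Now read as much as we can (or need) from the cache and place it
            //in the given byte[], never copying past the end of the buffer
            //or past the end of the file.
            int startPositionInBuffer = (int)(pos - bufBlockStart);
            int lengthToReadFromBuffer = (int) Math.min(
                    Math.min(bytesToRead - read, bufsize - startPositionInBuffer),
                    virtualFileLength - pos);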

            System.arraycopy(buf, startPositionInBuffer, bytes, read, lengthToReadFromBuffer);

            pos += lengthToReadFromBuffer;
            read += lengthToReadFromBuffer;
        }

        return read;
    }

    public void close() throws IOException {
        try {
            this.writeBufferToDisk();
        } finally {
            raf.close();
        }
    }

    /**
     * Gets the length of the file.
     * 
     * @return
     * @throws IOException
     */
    public long length() throws IOException{
        return virtualFileLength;
    }

}
