
In Java, how can I create an InputStream for a specific part of a file?

I need an InputStream that reads from a specific portion of a file, and nothing more.

From the consumer's perspective, the stream's content should appear to be only that portion. The Consumer<InputStream> should be unaware that its data came from a much larger file.
Therefore the InputStream should behave as follows:

  • The beginning of the file is skipped silently.
  • Then the desired portion of the file is returned.
  • Subsequent calls to is.read() return -1, even if the file contains more data.
Path file = Paths.get("file.dat");
int start = 12000;
int size = 600;

try (InputStream input = getPartialInputStream(file, start, size)) {
    // This should receive an InputStream that returns exactly 600 bytes.
    // Those bytes should correspond to the bytes in "file.dat" from position 12000 up to 12600.
    thirdPartyMethod(input);
}

Is there a good way to do this without having to implement a custom InputStream myself?
What could such a getPartialInputStream method look like?

There is something called a MappedByteBuffer whose content is a memory-mapped region of a file.

Another question has an answer that shows how to wrap such a MappedByteBuffer in an InputStream. This led me to this solution:

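// Assumes static imports of StandardOpenOption.READ and FileChannel.MapMode.READ_ONLY.
public InputStream getPartialInputStream(Path file, long start, long size) throws IOException {
    try (FileChannel channel = FileChannel.open(file, READ)) {
        // The mapping remains valid after the channel is closed.
        MappedByteBuffer content = channel.map(READ_ONLY, start, size);
        return new ByteBufferBackedInputStream(content);
    }
}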
public class ByteBufferBackedInputStream extends InputStream {

    ByteBuffer buf;

    public ByteBufferBackedInputStream(ByteBuffer buf) {
        this.buf = buf;
    }

    public int read() throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }
        return buf.get() & 0xFF;
    }

    public int read(byte[] bytes, int off, int len)
            throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }

        len = Math.min(len, buf.remaining());
        buf.get(bytes, off, len);
        return len;
    }
}

Warning about locked system resources (on Windows)

MappedByteBuffer suffers from a bug where the underlying file gets locked by the mapped buffer until the buffer itself is garbage-collected, and there is no clean way around it.

So you can only use this solution if you don't have to delete, move, or rename the file afterwards. Trying to do so would lead to a java.nio.file.AccessDeniedException (unless you're lucky enough that the buffer was already garbage-collected).

I'm not sure I should be hopeful about this getting fixed anytime soon.

Depending on where the original stream comes from, you might want to discard it and return your own stream instead. If the original stream supports reset(), the consumer at the receiving end could rewind it and make the data before the start offset visible to themselves.

public InputStream getPartialInputStream(InputStream is, int start, int size) throws IOException {
    // Put your fast-forward logic here, might want to use is.skip() instead
    for (int i = 0; i < start; i++) {
        is.read();
    }
    // Rewrite the part of stream you want the caller to receive so that
    // they receive *only* this part
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    for (int i = 0; i < size; i++) {
        int read = is.read();
        if (read != -1) {
            baos.write(read);
        } else {
            break;
        }
    }
    is.close();
    return new ByteArrayInputStream(baos.toByteArray());
}
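The comment in the code above mentions is.skip() as an alternative to reading byte by byte. Note that skip() is allowed to skip fewer bytes than requested, so it has to be called in a loop. A minimal sketch of such a loop follows; the class name StreamSkipper and the method name skipUpTo are made up for illustration, and the lenient behaviour at end of stream (stop instead of throwing) is an assumption chosen to match the loop above.

```java
import java.io.IOException;
import java.io.InputStream;

public class StreamSkipper {

    /**
     * Discards up to n bytes from the stream (hypothetical helper).
     * Returns the number of bytes actually discarded, which is smaller
     * than n only if the stream ended first.
     */
    public static long skipUpTo(InputStream is, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = is.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (is.read() != -1) {
                // skip() made no progress: consume a single byte instead
                remaining--;
            } else {
                // End of stream reached before n bytes were discarded
                break;
            }
        }
        return n - remaining;
    }
}
```

On Java 12 and later, InputStream.skipNBytes(n) does similar work but throws an EOFException if the stream ends early.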

Edit as an answer to comment:

If rewriting the stream is undesirable, e.g. due to memory constraints, you can skip the first start bytes as in the first loop and then return the stream wrapped in something like Guava's ByteStreams.limit(is, size). Alternatively, subclass the stream and override read() with a counter so that it returns -1 once size bytes have been read.
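For readers without Guava, the counter idea could look roughly like the sketch below. It wraps the stream with a FilterInputStream instead of subclassing a concrete stream class, which is a design choice of this sketch and not something the answer prescribes. The class name BoundedInputStream is hypothetical. It assumes the wrapped stream has already been advanced to the start offset.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedInputStream extends FilterInputStream {

    // Number of bytes the consumer is still allowed to see
    private long remaining;

    public BoundedInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1;
        }
        int b = super.read();
        if (b != -1) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        // Never ask the underlying stream for more than the remaining budget
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (int) Math.min(super.available(), remaining);
    }

    // Disable mark/reset so the consumer cannot rewind past the start offset
    @Override
    public boolean markSupported() {
        return false;
    }

    @Override
    public synchronized void mark(int readLimit) {
        // intentionally a no-op
    }

    @Override
    public synchronized void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }
}
```

Usage would be along the lines of new BoundedInputStream(is, size) after the first start bytes have been discarded from is.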

You could also write the portion to a temp file and return its stream. This would prevent the end user from finding the original file's name via reflection on the original file's FileInputStream.
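A rough sketch of the temp-file idea, working from a Path instead of an existing stream. The class name TempFileSlice, the method name open, and the "slice-" prefix are all made up for illustration. The sketch assumes the default file system provider, which accepts DELETE_ON_CLOSE when opening an input stream; other providers may not.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TempFileSlice {

    /**
     * Copies the bytes [start, start + size) of the file into a temp file
     * and returns a stream over that temp file (hypothetical helper).
     */
    public static InputStream open(Path file, long start, long size) throws IOException {
        Path temp = Files.createTempFile("slice-", ".tmp");
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(temp, StandardOpenOption.WRITE)) {
            long copied = 0;
            while (copied < size) {
                // transferTo may copy fewer bytes than requested, so loop
                long n = in.transferTo(start + copied, size - copied, out);
                if (n <= 0) {
                    break; // source file is shorter than start + size
                }
                copied += n;
            }
        }
        // Assumption: the provider supports DELETE_ON_CLOSE here, so the
        // temp file is removed once the returned stream is closed.
        return Files.newInputStream(temp, StandardOpenOption.DELETE_ON_CLOSE);
    }
}
```

The trade-off is extra disk I/O in exchange for a stream that holds no reference to the original file.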

I wrote a utility class that you can use like this:

try(FileChannel channel = FileChannel.open(file, READ);
    InputStream input = new PartialChannelInputStream(channel, start, start + size)) {

    thirdPartyMethod(input);
}

It reads the content of the file using a ByteBuffer, so you control the memory footprint.

import java.io.IOException;
import java.io.InputStream;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PartialChannelInputStream extends InputStream {

    private static final int DEFAULT_BUFFER_CAPACITY = 2048;

    private final FileChannel channel;
    private final ByteBuffer buffer;
    private long position;
    private final long end;

    public PartialChannelInputStream(FileChannel channel, long start, long end)
            throws IOException {
        this(channel, start, end, DEFAULT_BUFFER_CAPACITY);
    }

    public PartialChannelInputStream(FileChannel channel, long start, long end, int bufferCapacity)
            throws IOException {
        if (start > end) {
            throw new IllegalArgumentException("start(" + start + ") > end(" + end + ")");
        }

        this.channel = channel;
        this.position = start;
        this.end = end;
        this.buffer = ByteBuffer.allocateDirect(bufferCapacity);
        fillBuffer(end - start);
    }

    private void fillBuffer(long stillToRead) throws IOException {
        if (stillToRead < buffer.limit()) {
            buffer.limit((int) stillToRead);
        }
        channel.read(buffer, position);
        buffer.flip();
    }

    @Override
    public int read() throws IOException {
        long stillToRead = end - position;
        if (stillToRead <= 0) {
            return -1;
        }

        if (!buffer.hasRemaining()) {
            buffer.clear(); // reset position and limit before refilling
            fillBuffer(stillToRead);
        }

        try {
            position++;
            return buffer.get() & 0xff; // mask so that byte 0xff is returned as 255, not -1 (EOF)
        } catch (BufferUnderflowException e) {
            // Encountered EOF
            position = end;
            return -1;
        }
    }
}

The implementation above lets you create multiple PartialChannelInputStream instances that read from the same FileChannel and use them concurrently, because each instance tracks its own position and uses positional reads.
If that's not necessary, the simplified code below takes a Path directly.

import static java.nio.file.StandardOpenOption.READ;

import java.io.IOException;
import java.io.InputStream;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

public class PartialFileInputStream extends InputStream {

    private static final int DEFAULT_BUFFER_CAPACITY = 2048;

    private final FileChannel channel;
    private final ByteBuffer buffer;
    private long stillToRead;

    public PartialFileInputStream(Path file, long start, long end)
            throws IOException {
        this(file, start, end, DEFAULT_BUFFER_CAPACITY);
    }

    public PartialFileInputStream(Path file, long start, long end, int bufferCapacity)
            throws IOException {
        if (start > end) {
            throw new IllegalArgumentException("start(" + start + ") > end(" + end + ")");
        }

        this.channel = FileChannel.open(file, READ).position(start);
        this.buffer = ByteBuffer.allocateDirect(bufferCapacity);
        this.stillToRead = end - start;
        fillBuffer();
    }

    private void fillBuffer() throws IOException {
        if (stillToRead < buffer.limit()) {
            buffer.limit((int) stillToRead);
        }
        channel.read(buffer);
        buffer.flip();
    }

    @Override
    public int read() throws IOException {
        if (stillToRead <= 0) {
            return -1;
        }

        if (!buffer.hasRemaining()) {
            buffer.clear(); // reset position and limit before refilling
            fillBuffer();
        }

        try {
            stillToRead--;
            return buffer.get() & 0xff; // mask so that byte 0xff is returned as 255, not -1 (EOF)
        } catch (BufferUnderflowException e) {
            // Encountered EOF
            stillToRead = 0;
            return -1;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}

One small fix for @neXus's PartialFileInputStream class: in the read() method, you need to make sure that a byte value of 0xff is not returned as -1, which the caller would interpret as end of stream.

return buffer.get() & 0xff;

does the trick.
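To see why the mask matters: Java's byte is signed, so widening the byte 0xff to int yields -1, which is exactly the value InputStream.read() uses to signal end of stream. The following small demonstration uses hypothetical names (UnsignedByteDemo, readRaw, readMasked) and is not part of the original classes.

```java
import java.nio.ByteBuffer;

public class UnsignedByteDemo {

    // Widens the signed byte directly: 0xff becomes -1
    public static int readRaw(ByteBuffer buf) {
        return buf.get();
    }

    // Masks to the range 0..255, as InputStream.read() requires
    public static int readMasked(ByteBuffer buf) {
        return buf.get() & 0xff;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xff};
        System.out.println(readRaw(ByteBuffer.wrap(data)));    // prints -1
        System.out.println(readMasked(ByteBuffer.wrap(data))); // prints 255
    }
}
```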

