How do you decide what byte[] size to use for InputStream.read()?

Question

When reading from InputStreams, how do you decide what size to use for the byte[]?

int nRead;
byte[] data = new byte[16384]; // <-- this number is the one I'm wondering about

while ((nRead = is.read(data, 0, data.length)) != -1) {
  // ... do something with the first nRead bytes of data ...
}

When do you use a small one vs. a large one? What are the differences? Does the number need to be a multiple of 1024? Does it make a difference whether the InputStream comes from the network or from disk?

Thanks much, I can't seem to find a clear answer elsewhere.

Answer 1

Most people use powers of 2 for the size. Once the buffer is at least 512 bytes, the exact size doesn't make much difference (less than 20%).

For network reads the optimal size can be 2 KB to 8 KB (the underlying packet size is typically up to ~1.5 KB). For disk access, the fastest size can be 8 KB to 64 KB. If you use 8 KB or 16 KB you won't have a problem.

Note that for network downloads, you are likely to find that a single read usually doesn't fill the whole buffer. Wasting a few KB doesn't matter much for 99% of use cases.
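To make the point about partially filled buffers concrete, here is a minimal sketch of a copy loop. The class name CopyLoop and the method copy are made up for illustration; the only thing that matters is that each iteration uses just the nRead bytes the read actually returned, whatever the buffer size is.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, not part of any library.
final class CopyLoop {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int nRead;
        while ((nRead = in.read(buffer, 0, buffer.length)) != -1) {
            // A read may return fewer bytes than buffer.length,
            // so only the first nRead bytes are valid.
            out.write(buffer, 0, nRead);
            total += nRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[20_000]; // in-memory stand-in for a real source
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(payload), sink, 8 * 1024);
        System.out.println("copied " + copied + " bytes");
    }
}
```

With 8 KB or 16 KB passed as bufferSize this follows the sizes suggested above; the unused tail of the buffer on a short read is simply ignored.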

Answer 2

In that situation, I always use a reasonable power of 2, somewhere in the range of 2K to 16K. In general, different InputStreams will have different optimal values, but there is no easy way to determine the value.

To determine the optimal value, you'd need to understand more about the exact type of InputStream you are dealing with, as well as things like the specifications of the hardware that is servicing the InputStream.

Worrying about this is probably a case of premature optimization.
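If it ever does turn out to matter, measuring is probably the only way to find out. The following is a rough sketch, not a proper benchmark: the class BufferSizeTiming and the method timeDrain are hypothetical names, and the in-memory ByteArrayInputStream is only a stand-in. You would have to substitute the real file or socket stream to get numbers that mean anything.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical timing harness; a real comparison needs warm-up and repeated runs.
final class BufferSizeTiming {

    /** Reads the stream to the end with the given buffer size; returns elapsed nanoseconds. */
    static long timeDrain(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long start = System.nanoTime();
        while (in.read(buffer, 0, buffer.length) != -1) {
            // Data is discarded; only the cost of reading is measured.
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[4 * 1024 * 1024]; // stand-in source, assumption
        // Try powers of 2 from 512 bytes up to 64 KB.
        for (int size = 512; size <= 64 * 1024; size *= 2) {
            long nanos = timeDrain(new ByteArrayInputStream(payload), size);
            System.out.printf("%6d bytes: %d us%n", size, nanos / 1_000);
        }
    }
}
```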

Answer 3

It mostly depends on how much memory you have and how much data you expect to read. You don't want to block too often, so consider BenCole's answer; on the other hand, you don't want to process a small chunk of data if your processing is slower than the actual reading.

I personally try to use a library and offload the task of choosing a buffer size to the library authors. After that, I promise myself never to read the library code, because it makes me mad.
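As one illustration of letting someone else pick the buffer size, here is a sketch using InputStream.transferTo from the JDK itself (available since Java 9) rather than a third-party library. That substitution is my assumption about what "a library" could mean here; the class TransferDemo and the method readFully are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; requires Java 9 or later for transferTo.
final class TransferDemo {

    /** Reads the whole stream into a byte array without choosing a buffer size. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out); // the JDK picks its own internal buffer size
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = readFully(new ByteArrayInputStream(new byte[20_000]));
        System.out.println("read " + data.length + " bytes");
    }
}
```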

Answer 4

I'd also say that, if reading from an InputStream (not from a ReadableByteChannel like a FileChannel or a SocketChannel), you should not care, as long as you're wrapping it in a BufferedInputStream with a "correct" buffer size: the internal buffer will take care of the reads for you so you can focus on just reading the pieces you need.

In that case, the buffer size is probably what you're looking for and I would redirect you to @Peter Lawrey's answer: 2-8 KB when the data is accessed from network, or 32-64 KB when it's from hard drive (a "chunk" of disk).
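A small sketch of what wrapping in a BufferedInputStream buys you. CountingInputStream is a hypothetical helper written only for this demonstration; it counts how often the underlying stream is actually asked for data while the caller reads one byte at a time.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class BufferedReadDemo {

    /** Hypothetical helper: counts read calls that reach the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads one byte at a time through a BufferedInputStream; returns bytes read. */
    static long readByteByByte(InputStream source, int bufferSize) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            while (in.read() != -1) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[20_000]));
        long total = readByteByByte(source, 8 * 1024);
        System.out.println(total + " bytes read via " + source.calls + " underlying reads");
    }
}
```

The 20,000 single-byte reads should reach the underlying stream as only a handful of bulk reads, which is why the size passed to the BufferedInputStream constructor is the number worth thinking about.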

When reading from a ByteChannel though, you'll have to do the buffering yourself through a ByteBuffer that you can allocate with that value.
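For the channel case, a minimal sketch of doing the buffering yourself with a ByteBuffer. The class ChannelReadDemo and the method drain are hypothetical, and the channel in main is built over an in-memory stream only to keep the example self-contained; a FileChannel or a blocking SocketChannel would be used the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper class, assuming a blocking channel.
final class ChannelReadDemo {

    /** Reads the channel to the end using a buffer of the given size; returns bytes read. */
    static long drain(ReadableByteChannel channel, int bufferSize) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        long total = 0;
        while (channel.read(buffer) != -1) {
            buffer.flip();               // switch from filling to draining
            total += buffer.remaining(); // process the bytes here
            buffer.clear();              // make the whole buffer writable again
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (ReadableByteChannel channel =
                     Channels.newChannel(new ByteArrayInputStream(new byte[100_000]))) {
            System.out.println("read " + drain(channel, 32 * 1024) + " bytes");
        }
    }
}
```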

Answer 5

You can use the available() method of the InputStream class. From the Javadoc:

Returns the number of bytes that can be read (or skipped over) from this input stream without blocking by the next caller of a method for this input stream. The next caller might be the same thread or another thread.
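A hedged sketch of how available() could be used as a hint. Newer versions of the Javadoc describe the return value as an estimate and say it is never correct to use it to allocate a buffer meant to hold the whole stream, and the value can be 0, so the sketch clamps it between a floor and a ceiling. The bounds of 512 bytes and 64 KB are my assumption, borrowed from the ranges mentioned in the other answers; AvailableHint and suggestBufferSize are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, not part of any library.
final class AvailableHint {
    static final int MIN_SIZE = 512;       // assumed floor
    static final int MAX_SIZE = 64 * 1024; // assumed ceiling

    /** Uses available() as a hint only, clamped to [MIN_SIZE, MAX_SIZE]. */
    static int suggestBufferSize(InputStream in) throws IOException {
        int hint = in.available(); // may be 0 and is only an estimate
        return Math.max(MIN_SIZE, Math.min(MAX_SIZE, hint));
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10_000]);
        byte[] data = new byte[suggestBufferSize(in)];
        System.out.println("buffer size: " + data.length);
    }
}
```

The read loop itself stays the same as in the question; only the size of the array changes.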

5 answers

solution1
27 ACCEPTED 2012-01-05 20:09:24

solution2
4 2012-01-05 20:22:40

solution3
3 2012-01-05 20:11:26

solution4
0 2020-05-07 11:29:12

solution5
0 2012-01-05 20:05:08
