
What is the buffer in ByteArrayOutputStream(int size) exactly?

Original question posted 2018-05-27 09:42:50. Tags: java, io

I understand what a buffer is when writing to a file: writing through the OS (calling a native API, one call per char) is costly, so many chars/bytes are collected in a buffer and the whole buffer is written to the file with one OS API call.

But what buffer is meant here? And why?

ByteArrayOutputStream(int size) - Creates a new byte array output stream, with a buffer capacity of the specified size, in bytes.

ByteArrayOutputStream() has a 32-byte buffer by default, which is why Apache Commons has a nearly identical class, org.apache.commons.io.output.ByteArrayOutputStream, that differs only in buffer size and growth mechanism: "The original implementation only allocates 32 bytes at the beginning. As this class is designed for heavy duty it starts at 1024 bytes. In contrast to the original it doesn't reallocate the whole memory block but allocates additional buffers. This way no buffers need to be garbage collected and the contents don't have to be copied to the new buffer. This class is designed to behave exactly like the original."

Besides, in ByteArrayInputStream(byte[] buf), as I understand it, "buf" (the buffer) is actually the source of the data (bytes) fed into the InputStream (ByteArrayInputStream emulates an InputStream on top of a byte array), so in my opinion the word "buffer" is confusing here.
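To make the ByteArrayInputStream case concrete, here is a minimal sketch of "buf" acting as the data source. The class name ByteArrayInputDemo and the helper readAll are made up for illustration. The sketch relies on the documented behaviour that the constructor does not copy the array, so a change made to the array before reading is visible through the stream.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

public class ByteArrayInputDemo {

    // Hypothetical helper: drains the stream into an int[] of unsigned byte values.
    static int[] readAll(ByteArrayInputStream in) {
        int[] values = new int[in.available()];
        for (int i = 0; i < values.length; i++) {
            values[i] = in.read(); // ByteArrayInputStream.read() declares no IOException
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] source = {1, 2, 3};

        // The stream wraps the array itself; the array is not copied.
        ByteArrayInputStream in = new ByteArrayInputStream(source);

        // Changing the array before reading changes what the stream returns.
        source[2] = 9;

        System.out.println(Arrays.toString(readAll(in))); // [1, 2, 9]
    }
}
```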

2 Answers

This class implements an output stream in which the data is written into a byte array. The buffer automatically grows as data is written to it.

The two terms used there, "byte array" and "buffer" (bold in the original answer), are synonymous. The buffer is the byte[] array that holds the bytes written to the stream.

The buffer size is analogous to the capacity of an ArrayList. If you write more than 32 bytes to a stream created with the default constructor, it has to grow the buffer, which involves allocating a new array and copying the bytes from the old array to the new one. A default "capacity" of 32 is inefficient if you know you'll be writing more than that.
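A minimal sketch of the capacity idea, assuming nothing beyond the JDK. The names BufferCapacityDemo, InspectableStream and capacityAfterWriting are hypothetical helpers for illustration. ByteArrayOutputStream keeps its array in the protected field buf, so a small subclass can report the array's current length. The exact size chosen when the buffer grows is an implementation detail, so the sketch does not claim a specific value for it.

```java
import java.io.ByteArrayOutputStream;

public class BufferCapacityDemo {

    // Hypothetical helper subclass: exposes the length of the protected
    // backing array `buf`, i.e. the current capacity of the buffer.
    static class InspectableStream extends ByteArrayOutputStream {
        InspectableStream(int size) {
            super(size);
        }

        int capacity() {
            return buf.length;
        }
    }

    // Writes `bytesToWrite` zero bytes into a stream created with the given
    // initial capacity and returns the capacity afterwards.
    static int capacityAfterWriting(int initialCapacity, int bytesToWrite) {
        InspectableStream out = new InspectableStream(initialCapacity);
        byte[] data = new byte[bytesToWrite];
        out.write(data, 0, data.length);
        return out.capacity();
    }

    public static void main(String[] args) {
        // Fits in the initial array: no reallocation, capacity is unchanged.
        System.out.println(capacityAfterWriting(32, 10));   // 32

        // Does not fit: a larger array is allocated and the bytes are copied.
        System.out.println(capacityAfterWriting(32, 33) >= 33); // true

        // Pre-sizing avoids the reallocation for the same amount of data.
        System.out.println(capacityAfterWriting(1024, 33)); // 1024
    }
}
```

As with ArrayList, the capacity is only a sizing hint: size() and toByteArray() still report just the bytes actually written.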

The Javadoc says:

This class implements an output stream in which the data is written into a byte array. The buffer automatically grows as data is written to it.

So in the space of two sentences, it has used two different terms for the same thing. There are numerous other examples in the same doc.

On the one hand, this might be confusing if you don't know that they refer to the same thing; it might be clearer if it said something like:

"... is written to a buffer, implemented as a byte array."

On the other hand, I think that once you know (or assume, since this is quite a common convention) that they refer to the same thing, it is no longer especially confusing.