How is InputStream managed in memory?
I am familiar with the concept of InputStream, buffers, and why they are useful (when you need to work with data that might be larger than the machine's RAM, for example).
I was wondering, though: how does the InputStream actually carry all that data? Could an OutOfMemoryError be caused if there is too much data being transferred?
Say I connect from a client to a server, requesting a 100GB file. The server starts iterating through the bytes of the file with a buffer, writing the bytes back to the client with outputStream.write(byte[]). The client is not ready to read the InputStream right now, for whatever reason. Will the server continue sending the bytes of the file indefinitely? And if so, won't the OutputStream/InputStream grow larger than the RAM of one of these machines?
InputStream and OutputStream implementations do not generally use a lot of memory. In fact, the word "stream" in these types means that the object does not need to hold the data, because the data is accessed sequentially, in the same way that a stream can carry water from a lake to the ocean without holding much water itself.
But "stream" is not the best word to describe this. It's more like a pipe, because when you transfer data from a server to a client, every stage propagates back-pressure from the client, and that back-pressure controls the rate at which data gets sent. This is similar to how your faucet controls the rate of flow through your pipes all the way to the city reservoir:
- As the client reads data, its InputStream only requests more data from the OS when its internal (small) buffers are empty. Each request allows only a limited amount of data to be transferred.
- As the server-side process write()s to its OutputStream, the OutputStream will try to write data to the OS. When the OS buffer is full, it will make the server process wait until the server-side buffer has space to accept new data.
Notice that a slow client can make the server process take a very long time. If you're writing a server and you don't control the clients, it's very important to take this into account and to ensure that few server-side resources are tied up while a long data transfer takes place.
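The server-side loop described in the question can be sketched roughly as follows. This is a minimal illustration, not the code of any particular server: the class name StreamCopy and the method copy are made up for this example, and in-memory streams stand in for the file and the socket. The point is that only one fixed-size buffer is ever allocated, no matter how much data passes through.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    /**
     * Copies everything from in to out through a single reusable buffer.
     * Memory use is bounded by bufferSize, not by the amount of data copied.
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // With a socket's OutputStream, this call blocks while the OS send
            // buffer is full, so a slow reader slows this loop down instead of
            // making data pile up in the JVM.
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the file and the network connection (assumption for the demo).
        InputStream source = new ByteArrayInputStream(new byte[100_000]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("bytes copied: " + copy(source, sink, 8192));
    }
}
```

With a real file and a real socket the loop has the same shape; the thread simply spends most of its time waiting inside write() when the client is slow.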
Your question is as interesting as it is difficult to answer properly.
First: InputStream and OutputStream are not a means of storage but a means of access: they describe that the data shall be accessed in sequential, unidirectional order, but not how it shall be stored. The actual way of storing the data is implementation-dependent. So, could there be an InputStream that holds the whole amount of data in memory at once? Yes, there could be, though it would be an appalling implementation. The most common and sensible implementation of InputStream / OutputStream stores just a small, fixed amount of data in a temporary buffer, of 4K-8K for example.
(So far, I suppose you already knew that, but it needed to be said.)
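To make the "access, not storage" point concrete, here is a small sketch of an InputStream that produces its bytes on demand and never stores them. The classes GeneratedStreamDemo and ZeroStream are hypothetical, written only for this illustration. The reader drains a 1 GiB stream while holding nothing but one 8K buffer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Objects;

public class GeneratedStreamDemo {
    /** Hypothetical stream that yields `length` zero bytes without ever storing them. */
    public static final class ZeroStream extends InputStream {
        private long remaining;

        public ZeroStream(long length) {
            this.remaining = length;
        }

        @Override
        public int read() {
            if (remaining <= 0) return -1;
            remaining--;
            return 0;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            Objects.checkFromIndexSize(off, len, b.length);
            if (len == 0) return 0;
            if (remaining <= 0) return -1;
            int n = (int) Math.min(len, remaining);
            Arrays.fill(b, off, off + n, (byte) 0); // data is produced on demand
            remaining -= n;
            return n;
        }
    }

    /** Reads the stream to the end through one 8K buffer and returns the byte count. */
    public static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        long oneGiB = 1L << 30;
        System.out.println("bytes read: " + drain(new ZeroStream(oneGiB)));
    }
}
```

A stream backed by a file or a socket behaves the same way from the caller's point of view: each read() hands over the next chunk, and what has already been consumed does not need to be kept.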
How long will the server block? Typically, a server should have a safety timeout so that a long block breaks the connection, thus releasing the blocked thread. The client should have one as well.
The timeouts set for the connection depend on the implementation and the protocol.
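As one possible illustration of such a timeout, the sketch below uses Socket.setSoTimeout on a loopback connection whose peer never sends anything. The class ReadTimeoutDemo and the method readTimesOut are invented for this example, and the 200 ms value is arbitrary. Note that setSoTimeout only bounds blocking reads; a thread blocked in write() on a plain java.net.Socket is not covered by it and has to be released some other way, for example by closing the socket from another thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    /** Returns true if a read from a silent peer gives up after timeoutMillis. */
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            client.setSoTimeout(timeoutMillis); // bound how long read() may block
            InputStream in = client.getInputStream();
            try {
                in.read(); // the peer never writes, so this can only time out
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // the blocked thread is released
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```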
No, it does not need to hold all the data. It just advances forward in the file (usually using buffered data). The stream can discard old buffers as it pleases.
Note that there are a lot of very different implementations of input streams, so the exact behaviour varies a lot.