Missing bytes in TCP file transfer

I need to read a file, break it into packets of some arbitrary size, let's say 512 bytes, and send those packets over TCP. The problem is that the receiver isn't getting all the bytes that I'm sending. If I send 1000 packets, the receiver blocks while reading from the InputStream at around packet 990, because there is no more data to read.

Here's the code (just the sending and receiving parts):

Sender:

int parts = (int)Math.ceil((double)(file.length()/512.0));

out.println(parts+"");
int readFile;
int i = 0;
while ((readFile = fileIn.read(buffer)) != -1) {
   i++;
   fileOut.write(buffer, 0, readFile);
   fileOut.flush();
   System.out.println("-- Sent packet " + i + "/" + parts + ". " + "Bytes sent = " + readFile);
}

Receiver:

int parts = Integer.parseInt(in.readLine());
byte[] buffer = new byte[512];
FileOutputStream pw = new FileOutputStream("file.ext");
DataInputStream fileIn = new DataInputStream(socket.getInputStream());
for(int j = 0; j < parts; j++){
   int read = 0;
   if(j == parts - 1){
      read = fileIn.read(buffer);
      pw.write(buffer, 0, read);
   }else{
      fileIn.readFully(buffer);
      pw.write(buffer);
   }
   System.out.println("-- Received packet " + (j+1) + "/" + parts + ". Read " +read+ " bytes.");
}

I tried increasing the socket's send and receive buffer sizes, but without success. What am I missing?

Here is an output sample:

Sender:

-- Sent packet 1/10. Bytes sent = 512
-- Sent packet 2/10. Bytes sent = 512
-- Sent packet 3/10. Bytes sent = 512
-- Sent packet 4/10. Bytes sent = 512
-- Sent packet 5/10. Bytes sent = 512
-- Sent packet 6/10. Bytes sent = 512
-- Sent packet 7/10. Bytes sent = 512
-- Sent packet 8/10. Bytes sent = 512
-- Sent packet 9/10. Bytes sent = 512
-- Sent packet 10/10. Bytes sent = 234

Receiver:

-- Received packet 1/10. Read 512 bytes.
-- Received packet 2/10. Read 512 bytes.
-- Received packet 3/10. Read 512 bytes.
-- Received packet 4/10. Read 512 bytes.
-- Received packet 5/10. Read 512 bytes.
-- Received packet 6/10. Read 512 bytes.
-- Received packet 7/10. Read 512 bytes. (And it blocks here, because there is no more data to read)

TCP is a stream protocol. There are no packets, just a single stream of data.

You shouldn't assume that a single write() (with or without a flush()) will correspond to a single read(). Therefore your receive loop for(int j = 0; j < parts; j++) is misguided: a better approach would be to count the number of bytes read and be prepared for read() calls returning varying amounts of data.
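The byte-counting approach described above might look roughly like the following. This is a minimal sketch, not the asker's code: the class name StreamReceiver and the method receiveExactly are hypothetical, and it assumes the receiver already knows the total number of bytes to expect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only.
final class StreamReceiver {

    /**
     * Copies exactly expectedBytes bytes from in to out, accepting whatever
     * amount of data each read() call happens to return.
     */
    static long receiveExactly(InputStream in, OutputStream out, long expectedBytes)
            throws IOException {
        byte[] buffer = new byte[8192];
        long received = 0;
        while (received < expectedBytes) {
            // Never ask for more than is still outstanding, so any bytes that
            // follow this transfer on the same stream are left untouched.
            int wanted = (int) Math.min(buffer.length, expectedBytes - received);
            int n = in.read(buffer, 0, wanted);
            if (n == -1) {
                throw new EOFException("Stream ended after " + received
                        + " of " + expectedBytes + " bytes");
            }
            out.write(buffer, 0, n); // write only the bytes actually read
            received += n;
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[4842]; // 9 * 512 + 234, as in the sample output
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long n = receiveExactly(new ByteArrayInputStream(data), sink, data.length);
        System.out.println("Received " + n + " bytes");
    }
}
```

The loop is driven by the byte count alone, so it does not matter whether the data arrives in ten reads of about 512 bytes or in a hundred smaller ones.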

In the comments you argue that readFully() takes care of the problem. However, my concern lies not so much with the code as with the packet-based view of the stream. This leads to bugs like the one in your final fileIn.read(buffer) call. It might return only half the data that was sent as part of your last "packet", and you'll never know!
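One way to avoid the short final read is to send the total byte count up front and have the receiver ask readFully() for exactly the number of bytes still outstanding, including for the last chunk. The sketch below is only an illustration under that assumption: the 8-byte length header replaces the part count from the question, and the names LengthPrefixedTransfer, send and receive are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only.
final class LengthPrefixedTransfer {

    /** Sender side: announce the total byte count, then stream the content. */
    static void send(InputStream source, long length, OutputStream socketOut)
            throws IOException {
        DataOutputStream out = new DataOutputStream(socketOut);
        out.writeLong(length); // 8-byte big-endian header
        byte[] buffer = new byte[512];
        long sent = 0;
        while (sent < length) {
            int wanted = (int) Math.min(buffer.length, length - sent);
            int n = source.read(buffer, 0, wanted);
            if (n == -1) {
                throw new IOException("Source ended after " + sent
                        + " of " + length + " bytes");
            }
            out.write(buffer, 0, n);
            sent += n;
        }
        out.flush();
    }

    /** Receiver side: read the header, then exactly that many bytes. */
    static long receive(InputStream socketIn, OutputStream target)
            throws IOException {
        DataInputStream in = new DataInputStream(socketIn);
        long total = in.readLong();
        long remaining = total;
        byte[] buffer = new byte[512];
        while (remaining > 0) {
            int chunk = (int) Math.min(buffer.length, remaining);
            // readFully blocks until chunk bytes have arrived and throws
            // EOFException if the stream ends first, so the last, shorter
            // chunk cannot be silently truncated.
            in.readFully(buffer, 0, chunk);
            target.write(buffer, 0, chunk);
            remaining -= chunk;
        }
        target.flush();
        return total;
    }
}
```

In this sketch the header and the payload are read through the same DataInputStream, so no second reader sits on the socket's input stream.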

"break it into packets of some arbitrary size"

Why? It's a streaming protocol. You can break it up as much as you like at the sending end, but the local TCP will then do its best to combine your writes into larger TCP segments; the local IP will then chunk that into MTU-sized IP packets; intermediate routers may fragment that further; the remote IP will then reassemble the fragments into packets; the remote TCP will then reassemble the packets into segments; and the remote application will then receive in chunks dependent on all that, as well as on the size of its socket receive buffer in the kernel and on the size of its application receive buffer.

Don't attempt to out-think all that processing. You can't. Just write what you have to write when you have it to write.

If you have 'missing bytes', it can only be because you are ignoring the length value returned by read() at the receiver or, if you are using non-blocking I/O at the sender, the length value returned by write() at the sender.
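As a rough illustration of honoring the length returned by read(), a plain copy loop could look like the sketch below. The class name StreamCopy and the method copy are hypothetical, and the sketch assumes blocking java.io streams, where write() either writes everything it is given or throws.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only.
final class StreamCopy {

    /** Copies everything from in to out until end of stream. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Write only the n bytes that read() actually delivered,
            // never the whole buffer.
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

The same loop can serve on the sending side (file to socket) and on the receiving side (socket to file). On the receiving side it ends only at end of stream, so it assumes the sender closes the connection, or shuts down its output, once the file has been written.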
