Sending large data over TCP/IP socket

I have a small project running a server in C# and a client in Java. The server sends images to the client. Some images are quite big (up to 10MiB sometimes), so I split the image bytes and send it in chunks of 32768 bytes each. My C# server code is as follows:

using (var stream = new MemoryStream(ImageData))
{
   for (int j = 1; j <= dataSplitParameters.NumberOfChunks; j++)
   {
      byte[] chunk;
      if (j == dataSplitParameters.NumberOfChunks)
         chunk = new byte[dataSplitParameters.FinalChunkSize];
      else
         chunk = new byte[dataSplitParameters.ChunkSize];

      int result = stream.Read(chunk, 0, chunk.Length);

      string line = DateTime.Now + ", Status OK, " + ImageName+ ", ImageChunk, " + j + ", " + dataSplitParameters.NumberOfChunks + ", " + chunk.Length;

      //write read params
      streamWriter.WriteLine(line);
      streamWriter.Flush();
      
      //write the data
      binaryWriter.Write(chunk);
      binaryWriter.Flush();
      Console.WriteLine(line);

      string deliveryReport = streamReader.ReadLine();
      Console.WriteLine(deliveryReport);
     }
  }

And my Java Client code is as follows:

long dataRead = 0;
for (int j = 1; j <= numberOfChunks; j++) {
    String line = bufferedReader.readLine();
    tokens = line.split(", ");
    System.out.println(line);

    int toRead = Integer.parseInt(tokens[tokens.length - 1]);
    byte[] chunk = new byte[toRead];
    int read = inputStream.read(chunk, 0, toRead);
    //do something with the data
    dataRead += read;

    String progressReport = pageLabel + ", progress: " + dataRead + "/" + dataLength + " bytes.";
    bufferedOutputStream.write((progressReport + "\n").getBytes());
    bufferedOutputStream.flush();

    System.out.println(progressReport);
}

The problem is when I run the code, either the client crashes with an error saying it is reading bogus data, or both the client and the server hang. This is the error:

Document Page 1, progress: 49153/226604 bytes.
�9��%>�YI!��F�����h�
Exception in thread "main" java.lang.NumberFormatException: For input string: .....

What am I doing wrong?

The basic problem.

Once you wrap an InputStream in a BufferedReader, you must stop accessing the InputStream directly. A BufferedReader is buffered: it reads as much data as it wants to, and it is NOT limited to reading exactly up to the next newline and stopping there.

The BufferedReader on the Java side has read a lot more than the header line, so it has already consumed a whole bunch of image data, and there is no way to get those bytes back out of the InputStream. By creating that BufferedReader you've made the job impossible, so you can't do that.
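A small demonstration of that read-ahead, as a sketch only: an in-memory stream stands in for the socket, and the class name, method name and sample header are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ReadAheadDemo {

    /** Reads one line through a BufferedReader, then reports how many bytes are still left in the raw stream. */
    static int bytesLeftAfterReadLine(byte[] wire) throws IOException {
        InputStream raw = new ByteArrayInputStream(wire);
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.ISO_8859_1));
        reader.readLine();        // we only asked for the header line...
        return raw.available();   // ...but the reader may already have pulled the bytes behind it
    }

    public static void main(String[] args) throws IOException {
        byte[] header = "Status OK, ImageChunk, 1, 1, 16\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] wire = new byte[header.length + 16];   // header line followed by 16 bytes of fake image data
        System.arraycopy(header, 0, wire, 0, header.length);
        System.out.println("bytes left for inputStream.read(): " + bytesLeftAfterReadLine(wire));
    }
}
```

On a typical JDK the reader drains the whole in-memory stream in one go, so nothing is left for a later inputStream.read. With a real socket the amount swallowed depends on timing, which would explain why the failure looks random.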

The underlying problem.

You have a single TCP/IP connection. On this, you send some irrelevant text (the page, the progress, etc), and then you send an unknown amount of image data, and then you send another irrelevant progress update.

That's fundamentally broken. How can an image parser possibly know that halfway through sending an image, you get a status update line? Text is just binary data too, there is no magic identifier that lets a client know: This byte is part of the image data, but this byte is some text sent in-between with progress info.

The simple fix.

You'd think the simple fix is... well, stop doing that then. Why are you sending this progress? The client is perfectly capable of knowing how many bytes it has read; there is no point sending that. Just take your binary data, open the OutputStream, and send all that data. On the client side, open the InputStream and read all that data. Don't involve strings. Don't use anything that smacks of 'works with characters' (so BufferedReader: no; BufferedInputStream is fine).

... but now the client doesn't know the title, nor the total size!

So make a wire protocol. It can be near trivial.

This is your wire protocol:

  1. 4 bytes, big endian: SizeOfName
  2. SizeOfName number of bytes. UTF-8 encoded document title.
  3. 4 bytes, big endian: SizeOfData
  4. SizeOfData number of bytes. The image data.

And that's only needed if you actually want the client to be able to render a progress bar and to know the title. If that's not needed, don't do any of that: just send the bytes straight up, and signal that the file has been completely sent by closing the connection.

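Here's some sample Java code for the receiving side:

import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

try (InputStream in = ....) {
  int nameSize = readInt(in);
  byte[] nameBytes = in.readNBytes(nameSize);
  String name = new String(nameBytes, StandardCharsets.UTF_8);
  int dataSize = readInt(in);
  try (OutputStream out =
    Files.newOutputStream(Paths.get("/Users/TriSky/image.png"))) {

    byte[] buffer = new byte[65536];
    while (dataSize > 0) {
      // never ask for more than what is left of this image
      int r = in.read(buffer, 0, Math.min(buffer.length, dataSize));
      if (r == -1) throw new IOException("Early end-of-stream");
      out.write(buffer, 0, r);
      dataSize -= r;
    }
  }
}

public int readInt(InputStream in) throws IOException {
    byte[] b = in.readNBytes(4);
    if (b.length < 4) throw new EOFException("Early end-of-stream");
    return ByteBuffer.wrap(b).getInt(); // ByteBuffer is big-endian by default
}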
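For completeness, the sending side of the same four-field protocol could look roughly like this. It is a sketch in Java rather than the asker's C#, and the FrameWriter class and writeFrame method are names invented here, not part of any library.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class FrameWriter {

    /** Writes one document as: [4-byte name length][name, UTF-8][4-byte data length][data]. */
    static void writeFrame(OutputStream out, String name, byte[] data) throws IOException {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        DataOutputStream dataOut = new DataOutputStream(out);
        dataOut.writeInt(nameBytes.length);   // writeInt emits 4 bytes, high byte first (big endian)
        dataOut.write(nameBytes);
        dataOut.writeInt(data.length);
        dataOut.write(data);                  // the whole image in one call, no manual chunking
        dataOut.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();   // stands in for socket.getOutputStream()
        writeFrame(wire, "Document Page 1", new byte[226604]);
        System.out.println("bytes on the wire: " + wire.size());
    }
}
```

If the frames come from the C# server instead, keep in mind that BinaryWriter.Write(int) emits little-endian bytes, so the two length fields would need to be written big endian explicitly (for example with BinaryPrimitives.WriteInt32BigEndian) to match what readInt expects.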

Closing notes

Another bug in your app is that you're using the wrong method. Java's read(byte[]) method will NOT (necessarily) fully fill that byte array. All read(byte[]) will do is read at least 1 byte (unless the stream has reached its end, in which case it reads none and returns -1). The idea is that read returns the optimal number of bytes: exactly as many as are ready to give you right now. How many is that? Who knows. If you ignore the value returned by in.read(bytes), your code is necessarily broken, and you're doing just that. What you really want is, for example, readNBytes, which guarantees that it fully fills that byte array (or reads until the stream ends, whichever happens first).

Note that in the transfer code above, I also use the basic read, but here I don't ignore the return value.
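To make the short-read behaviour concrete, here is a minimal sketch. The trickle helper is an invented stand-in for a slow socket that hands out at most 3 bytes per call, and readFully is a hand-written loop of the kind the client would need if readNBytes were not available.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullyDemo {

    /** An in-memory stream that returns at most 3 bytes per read call, imitating a slow socket. */
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    /** Keeps calling read until buf is completely filled, or fails if the stream ends first. */
    static void readFully(InputStream in, byte[] buf) throws IOException {
        int filled = 0;
        while (filled < buf.length) {
            int r = in.read(buf, filled, buf.length - filled);
            if (r == -1) throw new EOFException("stream ended after " + filled + " bytes");
            filled += r;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] chunk = new byte[10];
        int single = trickle(new byte[10]).read(chunk, 0, chunk.length);
        System.out.println("one read call returned " + single + " of " + chunk.length + " bytes");
        readFully(trickle(new byte[10]), chunk);   // the loop collects all 10 bytes
    }
}
```

The question's client makes exactly one read call per chunk, so any chunk that arrives in several pieces leaves image bytes in the stream, and the next readLine then tries to parse them as a header.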

Your Java code seems to be using a BufferedReader. It reads data into a buffer of its own, meaning that data is no longer available from the underlying socket input stream; that's your first problem. You have a second problem with how inputStream.read is used: it's not guaranteed to read all the bytes you ask for, so you would have to put a loop around it.

This is not a particularly easy problem to solve. When you mix binary and text data in the same stream, it is difficult to read it back. In Java, there is a class called DataInputStream that can help a little - it has a readLine method to read a line of text, and also methods to read binary data:

DataInputStream dataInput = new DataInputStream(inputStream);

for (int j = 1; j <= numberOfChunks; j++) {
    String line = dataInput.readLine();
    ...
    byte[] chunk = new byte[toRead];
    dataInput.readFully(chunk); // returns void; blocks until chunk is full or throws EOFException
    ...
}

DataInputStream has limitations: the readLine method is deprecated because it assumes the text is encoded in latin-1, and does not let you use a different text encoding. If you want to go further down this road you'll want to create a class of your own to read your stream format.
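A class of your own for this could be as small as the sketch below. MixedStreamReader and its methods are hypothetical names; the sketch assumes header lines are UTF-8 and end in "\n" or "\r\n", and it needs Java 11 or later for readNBytes(int).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MixedStreamReader {
    private final InputStream in;

    MixedStreamReader(InputStream in) {
        this.in = in;
    }

    /** Reads bytes up to the next '\n' and decodes them as UTF-8. Never consumes anything past the newline. */
    String readLine() throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != '\n') {
            if (b == -1) throw new EOFException("stream ended in the middle of a line");
            if (b != '\r') line.write(b);   // crude: drops carriage returns so "\r\n" also works
        }
        return new String(line.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reads exactly count bytes of binary data, or fails if the stream ends first. */
    byte[] readBytes(int count) throws IOException {
        byte[] data = in.readNBytes(count);
        if (data.length < count) throw new EOFException("expected " + count + " bytes, got " + data.length);
        return data;
    }

    public static void main(String[] args) throws IOException {
        byte[] header = "Status OK, ImageChunk, 1, 1, 4\r\n".getBytes(StandardCharsets.UTF_8);
        byte[] wire = new byte[header.length + 4];   // header line followed by 4 bytes of fake image data
        System.arraycopy(header, 0, wire, 0, header.length);

        MixedStreamReader reader = new MixedStreamReader(new ByteArrayInputStream(wire));
        String[] tokens = reader.readLine().split(", ");
        byte[] chunk = reader.readBytes(Integer.parseInt(tokens[tokens.length - 1]));
        System.out.println("read a chunk of " + chunk.length + " bytes");
    }
}
```

Because readLine pulls one byte at a time, it is worth wrapping the socket stream in a BufferedInputStream first. That is safe here because every read, text or binary, goes through the same buffered stream.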

Some images are quite big (up to 10MiB sometimes), so I split the image bytes and send it in chunks of 32768 bytes each.

You know this is totally unnecessary, right? There is absolutely no problem sending multiple megabytes of data into a TCP socket in one go and streaming all of it in on the receiving side.
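As a rough illustration of receiving everything in one go, assuming Java 9 or later and a sender that closes the connection when the image is complete: the class and method names are invented, and in-memory streams stand in for the socket and the output file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamWholeImage {

    /** Copies everything from in to out until end-of-stream and returns the number of bytes copied. */
    static long receive(InputStream in, OutputStream out) throws IOException {
        return in.transferTo(out);   // loops over read/write internally, so short reads are handled
    }

    public static void main(String[] args) throws IOException {
        InputStream fakeSocket = new ByteArrayInputStream(new byte[10 * 1024 * 1024]);   // a 10 MiB "image"
        ByteArrayOutputStream fakeFile = new ByteArrayOutputStream();
        System.out.println("received " + receive(fakeSocket, fakeFile) + " bytes");
    }
}
```

TCP splits the data into packets on its own, so application-level chunks of 32768 bytes buy nothing except more bookkeeping.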

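If you want to keep a text-based, line-oriented protocol, one option is to Base64-encode the image: open the image as a normal file, split its bytes into chunks, Base64-encode each chunk before sending it, and have the client decode each chunk on arrival. Raw image bytes are not printable text, and Base64 turns them into ordinary characters such as AfHM65Hkgf7MM that pass safely through readers and writers. Note that TCP itself carries arbitrary bytes, so this is not strictly required, and it makes the payload roughly a third larger.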
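For reference, a sketch of what Base64-encoding one chunk involves on the Java side, using the standard java.util.Base64 class. The class and method names are made up, and the 32768-byte chunk size is taken from the question.

```java
import java.util.Arrays;
import java.util.Base64;

final class Base64ChunkDemo {

    /** Encodes one binary chunk as a single line of printable ASCII characters. */
    static String encodeChunk(byte[] chunk) {
        return Base64.getEncoder().encodeToString(chunk);
    }

    /** Turns a received Base64 line back into the original bytes. */
    static byte[] decodeChunk(String line) {
        return Base64.getDecoder().decode(line);
    }

    public static void main(String[] args) {
        byte[] chunk = new byte[32768];
        String line = encodeChunk(chunk);
        System.out.println(chunk.length + " bytes become " + line.length() + " characters");
        System.out.println("round trip ok: " + Arrays.equals(chunk, decodeChunk(line)));
    }
}
```

Each encoded chunk contains no newline, so it can travel through println/readLine as one text line, at the cost of about 33% extra traffic.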

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.
