
Converting byte array with ASCII encoding to String produces weird result

I'm making a socket application in Java that receives some HTML data from the server in ASCII and then parses the data accordingly.

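byte[] receivedContent = new byte[12500];
// read() may return fewer bytes than the buffer holds, or -1 at end of stream
receivedSize = inputStream.read(receivedContent);
if (receivedSize == -1) {
  System.out.println("ERROR! NO DATA RECEIVED");
  System.exit(-1);
}
// Keep only the bytes actually read (receivedSize+1 would append a stray NUL byte)
receivedContent = Arrays.copyOf(receivedContent, receivedSize);
lastReceived = new String(receivedContent, StandardCharsets.US_ASCII);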

This should really be quite straightforward, but it's not. I printed out some debug messages and found that despite receiving some bytes of data (for example, printing receivedSize tells me it has received 784 bytes), the resulting string from those bytes is only a few characters long, like this:

Ard</a></li><li><a

I'm expecting a full HTML document, so this is clearly wrong. There's also no obvious pattern as to when this happens; it seems totally random. Since I'm allocating new memory for the buffer, there really shouldn't be any old data in it that interferes with the new data from the socket. Can someone shed some light on this strange behavior? Also, this seems to happen less frequently on my Windows machine running OracleJDK than on my remote Ubuntu machine running OpenJDK. Could that be the reason, and how would I fix it?

UPDATE: In the end I manually checked the bytes in the array against an ASCII table and found that the server is intentionally sending garbled data. Mystery solved.
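The manual check described in the update can be partly automated by printing the received bytes as hex, which makes unexpected values (control characters, or bytes above 0x7F that are not ASCII at all) easy to spot. The following is only a minimal sketch: the class name `ByteDump` and the helper `toHex` are hypothetical and not part of the original code, and the sample bytes stand in for whatever the socket actually delivered.

```java
import java.nio.charset.StandardCharsets;

class ByteDump {

    // Hypothetical helper: formats the first `length` bytes as space-separated hex pairs.
    static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative byte values print as 80..FF
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for receivedContent; in the real program pass receivedSize as the length.
        byte[] sample = "Ard</a>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(sample, sample.length)); // prints 41 72 64 3C 2F 61 3E
    }
}
```

Passing `receivedSize` rather than the buffer length keeps the unused zero-filled tail of the buffer out of the dump.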

Instead of using:

 inputStream.read(receivedContent);

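A single read() call returns only the bytes that are available at that moment, which can be fewer than the full response, so you need to keep reading until all the data has arrived. For example, with Apache Commons IO: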

 IOUtils.readFully(inputStream, receivedContent)
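Note that `IOUtils.readFully(inputStream, receivedContent)` tries to fill the whole buffer and throws an `EOFException` if the stream ends first, so it fits best when the exact length is known. When the length is not known, the same idea can be sketched with the JDK alone by looping until `read` returns -1. This is a minimal sketch, not the answerer's code: the names `ReadAllExample` and `readAll` are hypothetical, a `ByteArrayInputStream` stands in for the socket's stream, and it assumes the server closes the connection once it has sent everything.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadAllExample {

    // Hypothetical helper: keeps reading until end of stream, because a single
    // read() may return only part of the data.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            // Append only the n bytes that were actually read in this pass
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "<html><body>hello</body></html>".getBytes(StandardCharsets.US_ASCII);
        // Stand-in for the socket's input stream (assumption for this sketch)
        InputStream inputStream = new ByteArrayInputStream(sample);
        String lastReceived = new String(readAll(inputStream), StandardCharsets.US_ASCII);
        System.out.println(lastReceived);
    }
}
```

If the server keeps the connection open after responding, this loop would block waiting for more data; in that case the length has to come from the protocol itself, for example a Content-Length header.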
