
Parsing a byte array sent over TCP in Java

I am developing an embedded system that sends data over TCP. The system is ARM-based and its code is written in C. On the C side, I have an array of char (or unsigned bytes, i.e. uint8_t) that represents some encoded data:

 char buffer[BUFFER_SIZE] = {0, 11, 34, 176, 255}; // for example

This buffer will be sent to a server over TCP/IP using a popular GPRS module, the SIM800. The microcontroller talks to the SIM800 over UART, i.e. standard serial communication. I can send either a uint8_t or a char array; it makes no difference on the C side.

On the server side, some Java services receive and parse this array.

The problem is: in C, the uint8_t and char data types are more or less interchangeable, i.e. the values 0 to 255 cover the whole (extended) ASCII table. As far as I know, this is not true on the server. In Java, the byte data type is always signed and its range is -128 to 127. Moreover, the extended ASCII characters from 128 to 255 are non-standard and differ from system to system.

The Java service receives the data as a String and then converts it to an array of bytes.

I am confused. What will happen if I send the array above to the server? Can the Java service reinterpret it?

You may try the following after reading the bytes from the TCP stream:

        String str = new String(bytes, StandardCharsets.US_ASCII);
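
One caveat with US-ASCII decoding, shown as a minimal sketch (the class and method names are hypothetical, chosen only for illustration): `new String(bytes, charset)` replaces bytes that are not valid in the charset, so the values 176 and 255 from the example buffer do not survive a decode/encode round trip.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AsciiDecodeDemo {
    // Decodes the raw bytes as US-ASCII, then encodes the String back to bytes.
    static byte[] roundTripAscii(byte[] raw) {
        String text = new String(raw, StandardCharsets.US_ASCII);
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The example buffer from the question.
        byte[] raw = {0, 11, 34, (byte) 176, (byte) 255};

        String text = new String(raw, StandardCharsets.US_ASCII);
        // 176 is outside the 7-bit range, so it decodes to U+FFFD (65533).
        System.out.println((int) text.charAt(3));

        // Encoding back turns each replacement character into '?' (63).
        System.out.println(Arrays.toString(roundTripAscii(raw)));
    }
}
```

This decoding is therefore only safe when every byte of the payload is known to be in the range 0 to 127.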

You could convert the byte array to Base64 and send that to the Java server. The Java service would then convert it back to the original byte array.
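
A minimal sketch of the Java side of this approach, using the standard `java.util.Base64` class. The class name `Base64Demo` and the method `decodePayload` are hypothetical; the encoding step is done in Java here only to keep the example self-contained, whereas in the real setup the device would produce the Base64 text in C.

```java
import java.util.Arrays;
import java.util.Base64;

final class Base64Demo {
    // Turns the Base64 text received from the device back into raw bytes.
    static byte[] decodePayload(String base64Text) {
        return Base64.getDecoder().decode(base64Text);
    }

    public static void main(String[] args) {
        // The example buffer from the question.
        byte[] original = {0, 11, 34, (byte) 176, (byte) 255};

        // Stand-in for the encoding the device would perform.
        String wire = Base64.getEncoder().encodeToString(original);
        System.out.println(wire); // AAsisP8=

        // The decoded bytes match the original buffer exactly.
        System.out.println(Arrays.equals(original, decodePayload(wire))); // true
    }
}
```

Base64 text contains only ASCII characters, so it passes through any String conversion unchanged, at the cost of four characters on the wire for every three bytes of payload.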

The problem is that conversion between char and byte in Java is not straightforward, because it involves a charset. The Latin1 (ISO-8859-1) charset is the direct conversion: the low-order byte of each char is the original byte, while the high-order byte is 0.

So you must find out (it should be stated in the documentation of the Java service) how the service converts the input bytes to a String, i.e. which charset it uses, and then use the same charset for the reverse conversion.
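
As an illustration, here is a minimal sketch that assumes the service decodes with ISO-8859-1 (that is an assumption; the actual charset has to come from the service's documentation). The class name `Latin1RoundTrip` and the method `toRawBytes` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Latin1RoundTrip {
    // Recovers the raw bytes from a String that was decoded as ISO-8859-1.
    static byte[] toRawBytes(String received) {
        return received.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The example buffer from the question.
        byte[] sent = {0, 11, 34, (byte) 176, (byte) 255};

        // Assumption: the service builds its String with ISO-8859-1.
        String received = new String(sent, StandardCharsets.ISO_8859_1);

        // Using the same charset for the reverse conversion restores every byte.
        byte[] recovered = toRawBytes(received);
        System.out.println(Arrays.equals(sent, recovered)); // true
    }
}
```

ISO-8859-1 maps each of the 256 byte values to the char with the same numeric value, which is why this round trip is lossless.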

The natural way would be to use a Latin1 conversion. In that case, each Java byte is the int8_t interpretation of the original uint8_t byte: all bytes below 128 are unchanged, and bytes from 128 upward become original_value - 256. For example, 255 becomes -1 and 128 becomes -128.
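
The signed values can be mapped back to the 0 to 255 range on the Java side. The following is a minimal sketch; the class name `UnsignedByteDemo` and the method `toUnsigned` are hypothetical, while `Byte.toUnsignedInt` is part of the standard library (Java 8 and later).

```java
import java.util.Arrays;

final class UnsignedByteDemo {
    // Widens each signed Java byte to an int holding the original 0..255 value.
    static int[] toUnsigned(byte[] data) {
        int[] values = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            // Equivalent to data[i] & 0xFF.
            values[i] = Byte.toUnsignedInt(data[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // The example buffer from the question, as Java stores it.
        byte[] received = {0, 11, 34, (byte) 176, (byte) 255};

        System.out.println(Arrays.toString(received));             // [0, 11, 34, -80, -1]
        System.out.println(Arrays.toString(toUnsigned(received))); // [0, 11, 34, 176, 255]
    }
}
```

The bit pattern of each byte is the same on both sides; only the interpretation as signed or unsigned differs.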


 