I am receiving ByteBuffers in a Java UDP application.
The data in each ByteBuffer can be strings in any language, including special characters, each terminated by a zero byte.
I use the following code to extract the strings:
public String getString() {
    byte[] remainingBytes = new byte[this.byteBuffer.remaining()];
    this.byteBuffer.slice().get(remainingBytes);
    String dataString = new String(remainingBytes);
    int stringEnd = dataString.indexOf(0);
    if (stringEnd == -1) {
        return null;
    } else {
        dataString = dataString.substring(0, stringEnd);
        this.byteBuffer.position(this.byteBuffer.position() + dataString.getBytes().length + 1);
        return dataString;
    }
}
These strings are stored in a MySQL database with everything set to UTF-8.
If I run the application on Windows, special characters like ® are displayed but Chinese characters are not.
After adding the VM argument -Dfile.encoding=UTF8, Chinese characters are displayed but characters like ® are shown as ??.
Please help.
Edit:
The input strings in the UDP packet are variable-length byte fields, encoded in UTF-8 and terminated by 0x00.
For JDBC I also use useUnicode=true&characterEncoding=UTF-8
I'm not sure, but dataString contains only the data up to this zero, because stringEnd points at the first zero position and not past it. You could try
dataString = dataString.substring(0, stringEnd + 1);
or
String specChar = dataString.substring(stringEnd, stringEnd + 1); which should return only the special character, but as I said at the beginning, I'm not sure.
String dataString = new String(remainingBytes);
is wrong, because it decodes the bytes using the platform default charset. You should almost never do that. Find out which encoding was used to put the bytes into the UDP packet, and use the same encoding on that line:
String dataString = new String(remainingBytes, encoding); // e.g. "UTF-8"
Edit: based on your updated question, encoding should be "UTF-8".
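Note that dataString.getBytes() in the question also uses the platform default charset, so the buffer position can drift whenever the default is not UTF-8. One way to avoid both problems is to look for the 0x00 terminator in the raw bytes and decode only that slice as UTF-8. The following is a minimal sketch under that assumption; the class name NullTerminatedReader is hypothetical and stands in for whatever class holds your byteBuffer field.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper; only the getString() logic matters here.
final class NullTerminatedReader {
    private final ByteBuffer byteBuffer;

    NullTerminatedReader(ByteBuffer byteBuffer) {
        this.byteBuffer = byteBuffer;
    }

    /** Returns the next 0x00-terminated UTF-8 string, or null if no terminator remains. */
    String getString() {
        int start = byteBuffer.position();
        int end = start;
        // Search for the terminator in the raw bytes, before any decoding.
        while (end < byteBuffer.limit() && byteBuffer.get(end) != 0) {
            end++;
        }
        if (end == byteBuffer.limit()) {
            return null; // no terminator found; position is left unchanged
        }
        byte[] bytes = new byte[end - start];
        byteBuffer.get(bytes);          // relative read, moves position to 'end'
        byteBuffer.position(end + 1);   // skip the 0x00 terminator
        // Decode with an explicit charset instead of the platform default.
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Because the position is advanced by the number of bytes actually consumed, the result no longer depends on -Dfile.encoding. Scanning for a zero byte is safe in UTF-8, since 0x00 never appears inside a multi-byte sequence.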