I have a Spark Java web service that receives requests in UTF-8. When the body contains extended characters such as umlauts or letters with tildes, those characters come out wrong after the bytes are converted to a string. To debug:
1) I receive the request and display its bytes as Hex values (this contains the correct characters).
2) I then convert the received bytes to a string (specifying the charset of UTF-8).
3) Finally, I again display the string from step 2 as Hex values.
Unfortunately, the hex values from step 1 don't match the hex values from step 3. Below is the code I'm using:
byte[] bytes = request.bodyAsBytes();
LOGGER.debug( "1 - Body as bytes: " );
LOGGER.debug( javax.xml.bind.DatatypeConverter.printHexBinary(bytes) );
LOGGER.debug( "1 - End of body" );
// charset hard coded to UTF-8 for testing...
String charSet = requestHeadersDto.getCharacterSet().equals( "" ) ? DEFAULT_CHAR_SET : requestHeadersDto.getCharacterSet();
LOGGER.debug( "Charset: " + charSet );
String xml = new String( bytes , charSet );
LOGGER.debug( "2 - Body as bytes: " );
LOGGER.debug( javax.xml.bind.DatatypeConverter.printHexBinary( xml.getBytes() ) );
LOGGER.debug( "2 - End of body" );
What am I doing wrong? TIA.
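The problem is this call in step 3. With no argument, getBytes() encodes the string using the JVM's default charset, which is not necessarily UTF-8:
xml.getBytes()
It should be: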
xml.getBytes(charSet)
or
xml.getBytes(Charset.forName(charSet))
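To see the effect in isolation, here is a minimal sketch that decodes and re-encodes a small UTF-8 payload. The class name CharsetRoundTrip and the helpers toHex and roundTrip are made up for this illustration and are not part of the original code. toHex stands in for DatatypeConverter.printHexBinary, since javax.xml.bind is no longer bundled with the JDK from Java 11 onward.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Hypothetical helper: formats bytes as uppercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Hypothetical helper: decodes with one charset and re-encodes with another.
    static byte[] roundTrip(byte[] body, Charset decodeWith, Charset encodeWith) {
        String text = new String(body, decodeWith);
        return text.getBytes(encodeWith);
    }

    public static void main(String[] args) {
        // "für" written with a Unicode escape so the source file encoding does not matter.
        byte[] body = "f\u00FCr".getBytes(StandardCharsets.UTF_8);
        System.out.println("Received:        " + toHex(body));

        // Same charset in both directions: the bytes survive unchanged.
        byte[] same = roundTrip(body, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("UTF-8 both ways: " + toHex(same));

        // Different charset on the way out, which is what getBytes() with no
        // argument does whenever the platform default is not UTF-8.
        byte[] mixed = roundTrip(body, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("Re-encoded as ISO-8859-1: " + toHex(mixed));

        System.out.println("Default charset on this JVM: " + Charset.defaultCharset());
    }
}
```

The string itself was decoded correctly in the question's code. Only the second hex dump was misleading, because it encoded the string with a different charset than the one used to decode it.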