
How to handle special characters in Java?

I want to save a comment entered by the user to the DB as a CLOB. That works fine, but later I ran into an issue with special characters. If a user copies and pastes the comment from WordPad and it contains a "single quote" or other special characters (they are a bit different from the usual ones), they are converted into a reversed question mark or a square box. I tried to handle them using the code below.

values[4] = new String(values[4].getBytes("ISO-8859-1"), "UTF-8");

But I'm still getting square boxes. After debugging the issue, I realized that it is not able to handle a space. Please see the attached image.

Note: the comment length is 122 and it failed to handle only one space. I don't know what's wrong with that space.

Note that in Java the encoding matters only when

  1. doing some sort of (file-)IO or
  2. converting characters to bytes

Java's String objects are always encoded as UTF-16, so assuming that values is a String[], your code is doing the following:

  1. Take the String values[4] as a sequence of characters.
  2. Transform each character to one byte using the ISO-8859-1 encoding (a character that ISO-8859-1 cannot represent becomes the byte for "?").
  3. Use the UTF-8 encoding to convert these bytes back to characters.

E.g., the £ character will be converted to the byte value A3, but that single byte cannot be converted back using UTF-8: A3 is a continuation byte, which is only valid inside a multi-byte sequence, so on its own it decodes to the replacement character.
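The three steps above can be reproduced in isolation. This is a minimal sketch, not the asker's actual code: the class name BrokenRoundTrip and the helper reencode are made up for illustration, and the curly quote U+2019 is only an assumed example of what WordPad might paste.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; mirrors the line from the question.
class BrokenRoundTrip {

    // Same operation as: new String(s.getBytes("ISO-8859-1"), "UTF-8")
    static String reencode(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The pound sign becomes the single byte A3 in ISO-8859-1.
        byte[] pound = "\u00A3".getBytes(StandardCharsets.ISO_8859_1);
        System.out.printf("pound as ISO-8859-1: %02X%n", pound[0] & 0xFF);

        // A lone A3 is not valid UTF-8, so decoding yields U+FFFD.
        String brokenPound = reencode("\u00A3");
        System.out.println(brokenPound.equals("\uFFFD"));

        // A curly quote is not in ISO-8859-1 at all, so it is already
        // lost in step 2 and comes back as a plain question mark.
        String brokenQuote = reencode("\u2019");
        System.out.println(brokenQuote.equals("?"));

        // Pure ASCII text survives, which is why most of a comment looks fine.
        System.out.println(reencode("plain text").equals("plain text"));
    }
}
```

Only characters outside ASCII are damaged, which would be consistent with a long comment where a single pasted character breaks.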

To sum it up: that line of code is completely broken. While you are working with String objects there is no need to think about any kind of encoding. Where you do have to take care of code page issues is when converting to bytes, be it during I/O to a file or network stream, or when converting to byte arrays for encryption.
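As a sketch of what that looks like in practice: leave the String alone, and name the charset explicitly at the one point where characters become bytes and back. The class and method names here (ExplicitCharsetDemo, toUtf8, fromUtf8) and the sample comment are assumptions for illustration. Whether the original problem sits in the JDBC driver, the database column, or the page encoding cannot be told from the question.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing an explicit charset at the byte boundary.
class ExplicitCharsetDemo {

    // Characters -> bytes: the only place where an encoding is chosen.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes -> characters: must use the same encoding that produced the bytes.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample text with a curly quote, a pound sign and a no-break space.
        String comment = "It\u2019s \u00A35\u00A0only";

        byte[] bytes = toUtf8(comment);
        String back = fromUtf8(bytes);

        // The round trip is lossless because both directions agree on UTF-8.
        System.out.println(back.equals(comment));

        // Non-ASCII characters take more than one byte each in UTF-8.
        System.out.println(bytes.length > comment.length());
    }
}
```

Passing a StandardCharsets constant instead of a charset name string also avoids the checked UnsupportedEncodingException and any dependence on the platform default encoding.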


 