
Special unicode characters cause exceptions in MySQL JDBC

I have isolated a problem we are running into down to a simple test:

Run a plain JDBC insert or update against a longtext column, with the parameter value new String(new char[]{0xDBFF, 0xDC00}).

An exception occurs stating: "Incorrect string value: '\xF4\x8F\xB0\x80' for column"

It appears that these two characters, when paired together, form a valid Chinese symbol (individually they are meaningless).

How can I deal with these messed up characters? They form a valid symbol and Character.isDefined returns true for both characters. Stripping out specifically those character codes from all strings seems like it would be begging for more problems with different combinations of Chinese characters.

Encoded in UTF-8, this character (the surrogate pair decodes to the single code point U+10FC00) takes 4 bytes:

11110100 10001111 10110000 10000000
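The byte sequence above can be reproduced from the Java side. The following is a minimal sketch, not part of the original answer; the class name SurrogatePairDemo and the helper utf8Hex are made up for illustration. It decodes the surrogate pair from the question into its code point and prints the UTF-8 bytes in hex, which correspond to the bytes quoted in the MySQL error message.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class SurrogatePairDemo {

    // Returns the UTF-8 encoding of s as space-separated uppercase hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The same value used in the question: a high and a low surrogate.
        String s = new String(new char[]{0xDBFF, 0xDC00});

        // Two UTF-16 code units, but only one code point.
        System.out.println(s.length());                        // 2
        System.out.println(s.codePointCount(0, s.length()));   // 1
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 10fc00

        // Four UTF-8 bytes, matching the error message.
        System.out.println(utf8Hex(s));                        // F4 8F B0 80
    }
}
```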

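MySQL 5.0/5.1 does not support 4-byte UTF-8 characters; this is a known limitation. Its utf8 character set stores at most 3 bytes per character, which covers only the Basic Multilingual Plane. MySQL 5.5 (5.5.3 and later) does support 4-byte UTF-8 characters, through the utf8mb4 character set.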
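If upgrading is not an option, one possible workaround is to detect such characters in the application before they reach the database. The sketch below is an assumption on my part, not something from the original answer: the class Utf8Mb3Filter and both of its methods are hypothetical names. It works on code points rather than individual char values, so ordinary Chinese characters in the Basic Multilingual Plane are left untouched and only code points above U+FFFF (the ones that need 4 bytes in UTF-8) are flagged or replaced. Note that replacing them is lossy.

```java
// Hypothetical helper; names are illustrative, not from any library.
final class Utf8Mb3Filter {

    // True if s contains at least one code point above U+FFFF,
    // i.e. one that would need 4 bytes in UTF-8.
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Returns a copy of s in which every code point above U+FFFF
    // is replaced by the given replacement string. This loses data.
    static String replaceSupplementary(String s, String replacement) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                sb.append(replacement);
            } else {
                sb.appendCodePoint(cp);
            }
            // Advance by 1 or 2 chars depending on the code point.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pair = new String(new char[]{0xDBFF, 0xDC00});
        String input = "a" + pair + "b";
        System.out.println(hasSupplementary(input));
        System.out.println(replaceSupplementary(input, "?"));
    }
}
```

Whether to reject the input, replace the character, or store the value in a binary column instead is an application-level decision; the helper only makes the affected strings easy to find.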

See section 9.1.10, "Unicode Support", in the MySQL Reference Manual.
