
Invalid character constant in a UTF-8 character

I'm trying to assign 'o͝' (a phonetic character) to a Character in a Java program, but I get the error "Invalid character constant". My file uses UTF-8 and other phonetic characters work fine, but not this one. It looks as if this character is in fact two (an 'o' and a ligature or something like that), but I cannot break it into its component parts.

Code example:

Character test = 'o͝';

Any help would be appreciated.

The glyph is a "small letter o" followed by a "combining double breve" and can, in source, be written as:

String a = "\u006f\u035d";

Since it uses a combining character (i.e. it is two characters), the resulting value cannot be assigned to a single Java char; you'll need to use a String.
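To make the "two characters" point concrete, here is a minimal sketch that counts the chars and code points in that string. The class and method names (CombiningLength, charCount, codePointCount) are made up for illustration; only the String methods come from the standard library.

```java
class CombiningLength {
    // Number of UTF-16 code units (Java chars) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'o' followed by the combining double breve
        String a = "\u006f\u035d";

        System.out.println(charCount(a));      // 2
        System.out.println(codePointCount(a)); // 2

        // Each half fits in a char on its own; only the pair does not.
        char base = a.charAt(0);
        char mark = a.charAt(1);
        System.out.println(base == 'o');                    // true
        System.out.println(Integer.toHexString(mark));      // 35d
    }
}
```

Both counts come out as 2, which is why a single char (or Character) literal is rejected.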

You can try looking up the character's code in a character table and assigning it to a variable, for example:

char a = '\u0040';

As already said, you shouldn't hardcode characters like that; you should use the Unicode code point values found here:

http://www.utf8-chartable.de/

What you want actually involves a "combining character":

http://en.wikipedia.org/wiki/Combining_character

The combining diacritical marks are 0x0300 - 0x036f. So, e.g., to create the character you want ('o' with double breve), use:

String o_doubleBreve = "o\u035d";

Prints as o͝
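As a rough sketch of why some accented letters fit in a char while this one does not: some base-plus-mark pairs have a precomposed form that NFC normalization collapses to one char, and 'o' with double breve does not. The class and helper names below (CombiningMarkDemo, isCombiningDiacritic, fitsInSingleChar) are hypothetical; java.text.Normalizer is the standard library class.

```java
import java.text.Normalizer;

class CombiningMarkDemo {
    // True if the code point lies in the combining diacritical marks range 0x0300 - 0x036f.
    static boolean isCombiningDiacritic(int codePoint) {
        return codePoint >= 0x0300 && codePoint <= 0x036F;
    }

    // True if the string composes (NFC) into a single Java char.
    static boolean fitsInSingleChar(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC).length() == 1;
    }

    public static void main(String[] args) {
        String oDoubleBreve = "o\u035d"; // o + combining double breve
        String oDiaeresis = "o\u0308";   // o + combining diaeresis

        System.out.println(isCombiningDiacritic(oDoubleBreve.codePointAt(1))); // true

        // o + diaeresis has a precomposed form, so it collapses to one char.
        System.out.println(fitsInSingleChar(oDiaeresis));   // true

        // o + double breve has no precomposed form, so it stays two chars.
        System.out.println(fitsInSingleChar(oDoubleBreve)); // false
    }
}
```

This would also explain why other phonetic characters in the same file compile as char literals: they are presumably stored in a precomposed form.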

I agree with the above answers that using the \u representation is best in any new code you write. However, you will come across projects whose source code contains such characters directly and which, supposedly, compiled for their authors. One such example I am working with now is OpenNLP.

If you run into something like this in an IDE like Eclipse, you can change the workspace default encoding to UTF-8. This will allow the code to compile successfully.
