
Converting UTF-8 characters to upper/lower case in C++

I have a string that contains UTF-8 characters, and a method that is supposed to convert every character to either upper or lower case. This is easy for characters that overlap with ASCII, and obviously some characters cannot be converted, e.g. any Chinese character. However, is there a good way to detect and convert the other characters that do have upper and lower case forms, e.g. all the Greek characters? Please note that I need to be able to do this on both Windows and Linux.

Thank you,

Have a look at ICU.

Note that lower case to upper case functions are locale-dependent. Think about Turkish, where the (ASCII) letter I lowercases to the dotless ı, and the (ASCII) letter i uppercases to İ, an uppercase I with a dot.
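To make the Turkish case concrete, here is a minimal sketch that uppercases only the two letters involved. The function name upper_i and the turkic flag are made up for illustration; a real implementation would take this mapping from locale data (for example through ICU) rather than hard-coding it.

```cpp
// Hypothetical helper: uppercases only 'i' and dotless 'ı' to show how the
// result depends on the language. Every other code point is returned as-is.
char32_t upper_i(char32_t c, bool turkic) {
    if (c == U'i') {
        // Turkish and Azerbaijani keep the dot: i -> İ (U+0130).
        return turkic ? U'\u0130' : U'I';
    }
    if (c == U'\u0131') {
        // Dotless ı (U+0131) uppercases to plain I.
        return U'I';
    }
    return c;
}
```

So the same input, the ASCII letter i, yields I for English text and İ for Turkish text, which is why a case conversion routine needs to know the locale.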

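Assuming that you have access to wctype.h, convert your text to a wide-character (wchar_t) string, call towupper() on each character, then convert the result back to UTF-8. Keep in mind that wchar_t is 2 bytes on Windows but typically 4 bytes on Linux, and that towupper() follows the current C locale.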
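A minimal sketch of that round trip might look like the following. The names decode_utf8, append_utf8 and utf8_to_upper are invented for this example, the decoder assumes well-formed UTF-8, and the code assumes wint_t is wide enough to hold any code point (true with glibc on Linux, not on Windows, where it is 16 bits).

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Decode UTF-8 into code points. Assumes well-formed input; a production
// decoder would also reject overlong forms, surrogates and truncated input.
std::u32string decode_utf8(const std::string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        // Sequence length is determined by the lead byte.
        std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        // Strip the length marker bits from the lead byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Append one code point to a UTF-8 string.
void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Uppercase a UTF-8 string one code point at a time with towupper().
// Which non-ASCII letters get converted depends on the current C locale
// (see setlocale); characters without an uppercase form are left unchanged.
std::string utf8_to_upper(const std::string& in) {
    std::string out;
    for (char32_t cp : decode_utf8(in)) {
        std::wint_t up = std::towupper(static_cast<std::wint_t>(cp));
        append_utf8(out, static_cast<char32_t>(up));
    }
    return out;
}
```

Calling utf8_to_upper("hello") gives "HELLO" in any locale, and Chinese text passes through untouched. Because towupper() maps one character to exactly one character, this sketch cannot produce multi-character results such as SS for ß.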

On Linux, or with a standard library that supports it, you would obtain a std::locale object for the appropriate locale, as uppercase conversion is locale-specific. Convert each UTF-8 character to a wchar_t, call std::toupper() on it with that locale, then convert back to UTF-8. Note that the resulting string might be longer or shorter in bytes, and that a per-character mapping cannot handle letters whose uppercase form is more than one character: ß, which becomes SS in German, is the example everyone keeps bringing up.
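The locale-aware step on its own can be sketched as below, operating on text that has already been decoded into a std::wstring. The helper names locale_or_classic and to_upper_wide are made up for this example, and the locale name you pass (for instance "en_US.UTF-8" or "tr_TR.UTF-8") is an assumption about what is installed on the system, hence the fallback.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: try a named locale and fall back to the classic "C"
// locale when the system does not provide it (std::locale throws then).
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Uppercase a wide string using the ctype facet of the given locale.
// The mapping is one wchar_t to one wchar_t, so the length never changes.
std::wstring to_upper_wide(std::wstring text, const std::locale& loc) {
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    if (!text.empty()) {
        // ctype::toupper converts the range [low, high) in place.
        facet.toupper(&text[0], &text[0] + text.size());
    }
    return text;
}
```

With the classic locale only ASCII letters are guaranteed to be converted; whether Greek or accented letters are handled depends on the locale that was actually loaded.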

On Windows, this approach will work even less often, because wide characters there are UTF-16, which is not a fixed-width encoding, so characters outside the Basic Multilingual Plane take two wchar_t values (this violates the C++ language standard, but then maybe the standards committee shouldn't have tried to bluff Microsoft into breaking the Windows API). There is a ToUpper method in the CLR.

It is probably easier to use a portable library such as ICU.

Also make sure you know whether what you want is uppercase (capitalizing every letter) or titlecase (capitalizing the first letter of a string, or the first part of a ligature).

