
What are (some of the) UTF-8 string functions in C?

Original post 2012-01-07 10:05:45 6 2 c/ linux/ gcc/ unicode

For dealing with ASCII we have strlen, strcat, etc. For UTF-16 (i.e., UCS-2) we have the wcscat and wcslen functions.

For dealing with UTF-8 and UCS-4, what functions are available in C? Assume Linux/gcc.

2 answers

I don't think the standard C library has UTF-8-specific functions. There are surely third-party libraries for it.

However, the normal str functions can be used with UTF-8 in many cases.
strlen works well, returning the number of bytes (not characters). strcat works too (it also overruns your buffer easily, but that is normal for strcat).

The reason is that a 0 byte can't appear inside a multi-byte UTF-8 sequence: every byte of such a sequence has its high bit set. So if a 0 byte appears in a UTF-8 string, it is surely the end of the string, just like in ASCII.
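To make the bytes-versus-characters distinction concrete, here is a minimal sketch of a code point counter. The name utf8_count_codepoints is hypothetical (it is not a library function), and the sketch assumes the input is already valid UTF-8. It relies on the fact that continuation bytes always have the bit pattern 10xxxxxx, so counting every byte that is not a continuation byte counts the code points.

```c
#include <stddef.h>

/* Hypothetical helper: count Unicode code points in a NUL-terminated
 * UTF-8 string. Assumes the input is valid UTF-8; no validation is done. */
size_t utf8_count_codepoints(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* Bytes of the form 10xxxxxx continue a sequence; any other
         * byte starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the UTF-8 string "h\xC3\xA9llo" ("héllo"), strlen reports 6 bytes while this helper reports 5 code points. Note that a code point is still not always what a user perceives as one character (combining marks, for example).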

The standard does not specify the encoding or size used for the wide character functions, so assuming it to be UCS-2, UCS-4 or anything else is not portable. C11 brings standardized Unicode support, but I think it's too early to rely on that being implemented yet. Your best bet is to find a library to handle conversion to/from UTF-8/UCS-4 or any other encoding you may need.

Have a look at iconv, or the chapter on character handling in the GNU C library manual.
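As a rough illustration of the iconv route, the sketch below converts a UTF-8 string to UCS-4 code points. The function name utf8_to_ucs4 is hypothetical. The encoding names "UTF-8" and "UCS-4LE" are the ones glibc's iconv accepts; other iconv implementations may spell them differently, so treat them as an assumption. The bytes are requested in little-endian order and then reassembled, so the resulting values do not depend on the host's byte order.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string into UCS-4
 * code points stored in out[0..out_cap-1].
 * Returns the number of code points written, or (size_t)-1 on error
 * (invalid input, output buffer too small, or unsupported encoding). */
size_t utf8_to_ucs4(const char *utf8, uint32_t *out, size_t out_cap)
{
    /* iconv_open takes (tocode, fromcode). */
    iconv_t cd = iconv_open("UCS-4LE", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv wants a char ** for the input, but does not modify the bytes. */
    char *in = (char *)utf8;
    size_t in_left = strlen(utf8);
    char *dst = (char *)out;
    size_t out_left = out_cap * sizeof *out;

    size_t rc = iconv(cd, &in, &in_left, &dst, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    size_t n = out_cap - out_left / sizeof *out;

    /* The converter wrote little-endian bytes; rebuild host-order values. */
    const unsigned char *bytes = (const unsigned char *)out;
    for (size_t i = 0; i < n; ++i) {
        const unsigned char *b = bytes + 4 * i;
        out[i] = (uint32_t)b[0]
               | ((uint32_t)b[1] << 8)
               | ((uint32_t)b[2] << 16)
               | ((uint32_t)b[3] << 24);
    }
    return n;
}
```

Called on "h\xC3\xA9\xE2\x82\xAC" ("hé€") with a large enough buffer, this should yield the three code points U+0068, U+00E9 and U+20AC. With glibc, iconv is part of libc; on systems that use a separate libiconv you would need to link against it.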

Related questions

C String encoding UTF8 without libiconv

What is the locale of UTF8?

UTF8 processing in C

Strip invalid utf8 from string in c/c++

Objective-C / C Convert UTF8 Literally to Real string

How do I index a (not all ascii) utf8 string in C?

Removing diacritic symbols from UTF8 string in C

utf8 strings and malloc in c

Utf8 Linux filenames and C

Required to convert a String to UTF8 string


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1
3 ACCEPTED 2012-01-07 10:22:50

solution2
3 2012-01-07 10:25:51
