
How to store multiple utf8 symbols (uint32_ts) from utfcpp as a string?

Using the utfcpp library, one can split a UTF-8 encoded string ('哈哈哈') into several uint32_t code points (here 21704, 21704, 21704), which play the same role as the chars of a std::string.

In this situation, what's the best way to store a sequence of uint32_t 'characters' as a 'string'?

For example, putting (21704, 21704, 21704) into a vector<uint32_t> means iterating over the vector for every 'string comparison', which seems slower than comparing real std::string objects.

Thanks in advance.

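Either use std::wstring or a home-brewed std::basic_string<uint32_t>. Note that wchar_t is only 16 bits wide on Windows, so std::wstring cannot hold every code point in a single element there.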

Either one lets you use the usual string operators and member functions (comparison, find, substr and so on) to manipulate such objects.

Modern versions of C++ come with char16_t and char32_t. They should be preferred to the uintXX_t types because clause 24.2 Character traits [char.traits] mandates char_traits specializations for them:

This subclause defines requirements on classes representing character traits, and defines a class template char_traits<charT>, along with four specializations, char_traits<char>, char_traits<char16_t>, char_traits<char32_t>, and char_traits<wchar_t>, that satisfy those requirements.

This even gives immediate access to a basic_string specialization. 24.3 String classes [string.classes] says:

The header <string> defines the basic_string class template for manipulating varying-length sequences of char-like objects and four typedef-names, string, u16string, u32string, and wstring, that name the specializations basic_string<char>, basic_string<char16_t>, basic_string<char32_t>, and basic_string<wchar_t>, respectively.
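As a minimal sketch of what this means for the question, the code points from the example can be copied into a std::u32string and then compared or searched like any other string. The helper name to_u32string is hypothetical (it is not part of utfcpp or the standard library), and the input vector stands in for whatever utfcpp produced.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of utfcpp): copy already-decoded code points
// into a std::u32string so that the usual string operations become available.
std::u32string to_u32string(const std::vector<std::uint32_t>& code_points) {
    std::u32string out;
    out.reserve(code_points.size());
    for (std::uint32_t cp : code_points) {
        out.push_back(static_cast<char32_t>(cp));
    }
    return out;
}
```

With this, to_u32string({21704, 21704, 21704}) compares equal to the literal U"\u54C8\u54C8\u54C8" using the ordinary == operator, with no hand-written loop at the call site.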

Unfortunately, when it comes to direct I/O, no stream specializations for char32_t (such as basic_ostream<char32_t>) exist out of the box, but UTF-8 locales should have converters between char32_t and char.
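If relying on locale converters is not an option, one alternative is to encode the char32_t sequence back to UTF-8 by hand and write the resulting std::string to an ordinary stream. The sketch below is a hand-rolled encoder rather than the locale-based conversion mentioned above; the name to_utf8 is hypothetical, and replacing invalid code points with U+FFFD is an assumption made for the example.

```cpp
#include <string>

// Hypothetical helper: encode Unicode code points as UTF-8 bytes.
// Assumption: surrogates and values above U+10FFFF are replaced with U+FFFD.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            // 1 byte: 0xxxxxxx
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

The result can then be written with std::cout << to_utf8(s), provided the terminal or file is expected to contain UTF-8.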
