
What's the difference between std::codecvt and std::codecvt_utf8?

There is a question that confuses me: what exactly is the difference between std::codecvt and std::codecvt_utf8? According to the standard library reference, std::codecvt_utf8 is a class derived from std::codecvt, but could you please tell me why this function would throw an exception?

std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> cvtUtf8 { new std::codecvt_byname<wchar_t, char, std::mbstate_t>(".65001") };
std::wstring_convert<std::codecvt_utf8<wchar_t>> cvt_utf8;

std::string strUtf8 = cvt_utf8.to_bytes(L"你好");
std::string strUtf8Failed = cvtUtf8.to_bytes(L"你好"); // throws an exception: bad conversion

codecvt is a template intended to be used as a base of a conversion facet for converting strings between different encodings and different sizes of code units. It has a protected destructor, which practically prevents it from being used without inheritance.
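The protected destructor is why std::codecvt cannot be handed directly to std::wstring_convert, which deletes the facet it owns. A common workaround is a small derived wrapper with a public destructor. The following is only a sketch of that idiom; the names deletable_facet, native_facet, and narrow_native are illustrative and do not come from the question or the answer. Since wstring_convert is deprecated as of C++17, a compiler may emit warnings here.

```cpp
#include <cwchar>   // std::mbstate_t
#include <locale>   // std::codecvt, std::wstring_convert
#include <string>
#include <utility>  // std::forward

// Hypothetical helper: derives from a facet and exposes a public
// destructor, so std::wstring_convert is able to delete it.
template <class Facet>
struct deletable_facet : Facet {
    template <class... Args>
    deletable_facet(Args&&... args) : Facet(std::forward<Args>(args)...) {}
    ~deletable_facet() {}
};

// The "native wide <-> native narrow" facet, made deletable.
using native_facet =
    deletable_facet<std::codecvt<wchar_t, char, std::mbstate_t>>;

// Converts a wide string with the default-constructed native facet.
// Assumption: callers accept that non-ASCII input may fail to convert,
// in which case to_bytes throws std::range_error.
std::string narrow_native(const std::wstring& wide) {
    std::wstring_convert<native_facet> converter;
    return converter.to_bytes(wide);
}
```

Plain ASCII input such as L"abc" should come back unchanged. What happens to non-ASCII input depends on the platform's native narrow encoding, which is exactly where this facet and codecvt_utf8 can diverge.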

The codecvt<wchar_t, char, mbstate_t> specialization in particular is a conversion facet for "conversion between the system's native wide and the single-byte narrow character sets".

codecvt_utf8 inherits from codecvt and is a facet for conversion between "UTF-8 encoded byte string and UCS2 or UCS4 character string". It has a public destructor.

If the system's native wide encoding is not UCS2 or UCS4, or if the system's native narrow encoding isn't UTF-8, then the two facets do different things.


could you please tell me why this function would throw an exception?

Probably because the C++ source file was not encoded in the same encoding as the converter expects the input to be.
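One way to take the source file encoding out of the picture is to spell the wide literal with universal character names, so the compiler never has to decode the file's bytes. The sketch below assumes that approach and uses codecvt_utf8 as in the question; the function name to_utf8 is made up for illustration. Both codecvt_utf8 and wstring_convert are deprecated, so warnings are to be expected.

```cpp
#include <codecvt>  // std::codecvt_utf8
#include <locale>   // std::wstring_convert
#include <string>

// Illustrative helper: encodes a UCS2/UCS4 wide string as UTF-8.
// Throws std::range_error if the conversion fails.
std::string to_utf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wide);
}

// "你好" written as universal character names (U+4F60, U+597D),
// which does not depend on how the source file itself is encoded.
const std::wstring greeting = L"\u4f60\u597d";
```

Calling to_utf8(greeting) should yield the six bytes E4 BD A0 E5 A5 BD. If the same call fails with a literal typed directly as L"你好", the compiler's idea of the source encoding is a likely suspect.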


 new std::codecvt<wchar_t, char, std::mbstate_t>(".65001") 

codecvt has no constructor that accepts a string.


It might be worth noting that the <codecvt> header (which provides codecvt_utf8) and wstring_convert have been deprecated since C++17.

Which one should be used instead of codecvt?

The standard committee chose to deprecate these conversion facilities before providing an alternative. You can either keep using them - with the knowledge that they may be replaced by something else in the future, and with the knowledge that they have serious shortcomings that are cause for deprecation - or you can do what you could do prior to C++11: implement the conversion yourself, or use a third party implementation.
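For the "implement the conversion yourself" option, a UTF-8 encoder is short enough to write by hand. The following is a minimal sketch, not a vetted library: it takes UTF-32 input as std::u32string (an assumption made here to avoid the platform-dependent width of wchar_t), and the names append_utf8 and utf32_to_utf8 are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Appends the UTF-8 encoding of a single code point to `out`.
// Rejects surrogates and values above U+10FFFF.
void append_utf8(std::string& out, char32_t cp) {
    if (cp >= 0xD800 && cp <= 0xDFFF)
        throw std::range_error("surrogate code point");
    if (cp > 0x10FFFF)
        throw std::range_error("code point out of range");

    if (cp < 0x80) {                 // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Encodes a whole UTF-32 string as UTF-8.
std::string utf32_to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) append_utf8(out, cp);
    return out;
}
```

For example, utf32_to_utf8(U"\u4f60\u597d") produces the same six bytes that codecvt_utf8 gives for "你好". Decoding UTF-8 back to code points needs more validation (overlong forms, truncated sequences) and is left out of this sketch.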
