How to convert UTF-16 to UTF-8 using C++?

  • I already know about 'codecvt', 'WideCharToMultiByte', and some others.

I work with the Korean language, for example '안녕하세요'.

This message can be stored in a normal string class, right?

But in my case, I have a file 'test.txt' that contains '안녕하세요'.

I read 'test.txt' with getline():

ifstream file("test.txt");  // open test.txt for reading
string temp;
getline(file, temp);        // read the first line as raw bytes
cout << temp;

Then I print it with cout. Ta-da! The message is broken!

I know this is a wide-character problem, so I tried the MultiByteToWideChar method.

OK, that works well.

But this is not what I want.

Ultimately, I want to read wide-character files and save the contents in a 'string' variable.

So, my question to you:

How do I convert UTF-16 (wide character / wstring) to UTF-8 (multibyte / string) without changing the message?

I want something like this:

wstring temp = L"안녕하세요";

string temp2 = convert_to_string(temp);

->

string temp2 = "안녕하세요"

As mentioned in the comments, see "Convert C++ std::string to UTF-16-LE encoded string" for code that does the conversion.

But given that you assume a wstring holds your Korean string, you avoid the trouble of distinguishing UTF-16-LE from UTF-16-BE, and you can readily find the Unicode code point of each Korean character in the string (Hangul syllables lie in the Basic Multilingual Plane, so each one occupies a single UTF-16 code unit). So your problem boils down to finding the UTF-8 representation of any code point. That is not hard; see page 3 of https://tools.ietf.org/html/rfc3629 (also Wikipedia: https://en.wikipedia.org/wiki/UTF-8 ).
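
To make the RFC 3629 bit layout concrete, here is a minimal sketch of encoding one code point as UTF-8. The name append_utf8 is a hypothetical helper, not a standard library function, and replacing invalid values with U+FFFD is an assumption of this sketch rather than something the RFC requires.

```cpp
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to `out`.
void append_utf8(std::string& out, char32_t cp) {
    // Assumption of this sketch: invalid values (out of range, or a lone
    // surrogate) are replaced with U+FFFD instead of being rejected.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        cp = 0xFFFD;
    }
    if (cp < 0x80) {                       // 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}
```

For example, calling append_utf8(s, 0xC548) for '안' (U+C548) appends the three bytes EC 95 88, which is the three-byte form used for every Hangul syllable.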

Sample code is in "Convert Unicode code points to UTF-8 and UTF-32".
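
Putting the pieces together, a hand-rolled convert_to_string (the name the question uses; it is not a library function) might look like the sketch below. It assumes the wstring holds either UTF-16 code units (as on Windows, where wchar_t is 16 bits) or whole code points (as on most Unix-like systems, where wchar_t is 32 bits), and it replaces unpaired surrogates with U+FFFD, which is a choice of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper named after the function in the question.
std::string convert_to_string(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = static_cast<char32_t>(in[i]);

        // A high surrogate followed by a low surrogate encodes one code
        // point above U+FFFF; combine the pair and skip the second unit.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            char32_t lo = static_cast<char32_t>(in[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        // Assumption: unpaired surrogates and out-of-range values become U+FFFD.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }

        // Emit the UTF-8 bytes using the RFC 3629 layout.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this, convert_to_string(L"안녕하세요") should return a std::string holding the same text as UTF-8 bytes, provided the compiler decodes the source file correctly. Whether cout then displays it properly still depends on the console's code page or locale, which is outside what this sketch covers.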
