
UTF-8 conversion for characters

I currently have a std::string that contains this:

"\xa9 2006 FooWorld"

Basically, it contains the © symbol. The string is passed to a method of an external API that expects UTF-8. How can I make this string valid UTF-8? I read that I could use std::wstring_convert, but I am not sure how to apply it in my case. Any suggestions would be appreciated.

That's simple: use a UTF-8 string literal:

u8"\u00A9 2006 FooWorld"

That results in a const char[] holding a properly encoded UTF-8 string (since C++20 the array type is const char8_t[] instead).
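As a rough check of what the literal above produces, here is a minimal sketch. The function name copyright_notice is made up for illustration. The reinterpret_cast is only there so the same code also compiles under C++20, where u8 literals are char8_t-based; in C++11 through C++17 it is a no-op.

```cpp
#include <string>

// Hypothetical helper: returns the UTF-8 bytes of the u8 literal as a std::string.
// The cast keeps this compiling under C++20, where the literal is const char8_t[].
std::string copyright_notice() {
    return std::string(reinterpret_cast<const char*>(u8"\u00A9 2006 FooWorld"));
}
```

U+00A9 is encoded as the two bytes 0xC2 0xA9, so the resulting string is 16 bytes long rather than 15.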

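In C++11 and later, the best way to get a UTF-8 encoded string literal is to use the u8 prefix (note that since C++20 such literals have type const char8_t[], so they no longer initialize a std::string directly):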

std::string str = u8"\u00A9 2006 FooWorld";

or, provided the compiler reads the source file's encoding correctly:

std::string str = u8"© 2006 FooWorld";

However, you can use std::wstring_convert, too (especially if your input data is not a string literal). Be aware that it, together with the <codecvt> header, is deprecated since C++17:

#include <codecvt>
#include <locale>
#include <string>

std::wstring wstr = L"© 2006 FooWorld"; // or whatever...

std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> convert;

std::string str = convert.to_bytes(wstr);
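If the data arrives at runtime as a narrow std::string, as in the question, the "\xa9" byte suggests the text is ISO-8859-1 (Latin-1). That is an assumption, not something the question confirms. Under that assumption every byte is already a Unicode code point, so a manual conversion can be sketched as follows. The helper name latin1_to_utf8 is hypothetical.

```cpp
#include <string>

// Hypothetical helper, assuming the input is ISO-8859-1 (Latin-1):
// each byte is treated as a code point U+0000..U+00FF and re-encoded as UTF-8.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two bytes
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII range is identical in Latin-1 and UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF take two bytes: 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this sketch, latin1_to_utf8("\xa9 2006 FooWorld") turns the leading 0xA9 byte into 0xC2 0xA9 and leaves the ASCII remainder unchanged. It would give wrong results for input in another single-byte encoding such as Windows-1252, where bytes 0x80 to 0x9F map to different characters.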
