
How can I convert a std::string to UTF-8?

I need to use the contents of a std::stringstream as a value in a JSON document (using the RapidJSON library), but std::stringstream::str is not working because it does not return UTF-8 characters. How can I do that?

Example: d["key"].SetString(tmp_stream.str());

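rapidjson::Value::SetString accepts a pointer and a length. Pass the document's allocator as well so that RapidJSON copies the bytes; without it, the value only keeps a pointer into the std::string. So you have to call it this way:

std::string stream_data = tmp_stream.str();
// The allocator overload copies the bytes, so stream_data may go out of scope afterwards.
d["key"].SetString(stream_data.data(), static_cast<rapidjson::SizeType>(stream_data.size()), d.GetAllocator());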

As others have mentioned in the comments, std::string is a container of char values with no encoding specified. It can contain UTF-8 encoded bytes or any other encoding.

I tested putting invalid UTF-8 data in a std::string and calling SetString. RapidJSON accepted the data and simply replaced the invalid characters with "?". If that's what you're seeing, then you need to:

  1. Determine what encoding your string has
  2. Re-encode the string as UTF-8
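For the first step, a quick way to rule UTF-8 in or out is to check whether the bytes are well-formed UTF-8. The following is a minimal sketch, not part of RapidJSON; the function name is_valid_utf8 is hypothetical. Passing the check does not prove the data was meant as UTF-8, but failing it shows that it is some other encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if s contains only well-formed UTF-8.
// Rejects overlong forms, UTF-16 surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;               // stray continuation or invalid lead byte
        if (n - i < len) return false;   // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;     // overlong 2-byte form
        if (len == 3 && cp < 0x800) return false;    // overlong 3-byte form
        if (len == 4 && cp < 0x10000) return false;  // overlong 4-byte form
        if (cp > 0x10FFFF) return false;             // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        i += len;
    }
    return true;
}
```

If is_valid_utf8(tmp_stream.str()) returns false, the stream holds bytes in some other encoding and needs the re-encoding step.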

If your string is ASCII, then SetString will work fine, because every ASCII string is also valid UTF-8.
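Checking for pure ASCII is straightforward: every byte must be below 0x80. A minimal sketch, with the hypothetical helper name is_ascii:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true if every byte in s is in the 7-bit ASCII range.
bool is_ascii(const std::string& s) {
    return std::all_of(s.begin(), s.end(), [](char c) {
        // Cast first: plain char may be signed, so bytes >= 0x80 can be negative.
        return static_cast<unsigned char>(c) < 0x80;
    });
}
```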

If your string is UTF-16 or UTF-32 encoded, there are several lightweight, portable libraries that can convert it to UTF-8, such as utfcpp. C++11 added a standard API for this (std::wstring_convert and the <codecvt> header), but it was poorly supported and has been deprecated since C++17.
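To show what such a conversion involves, here is a hand-written sketch of UTF-16 to UTF-8 re-encoding. It is not the utfcpp API; the names append_utf8 and utf16_to_utf8 are hypothetical, and the choice to replace unpaired surrogates with U+FFFD is an assumption (a library may report an error instead). In real code a maintained library is probably the better choice.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to out.
void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: re-encodes a UTF-16 string as UTF-8.
// Unpaired surrogates are replaced with U+FFFD (an assumption of this sketch).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                // Combine the surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // high surrogate without a following low surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // low surrogate without a preceding high surrogate
        }
        append_utf8(out, cp);
    }
    return out;
}
```

The resulting std::string can then be passed to SetString like any other UTF-8 data.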

If your string is encoded with a legacy encoding like Windows-1252, then you might need to use either an OS API like MultiByteToWideChar on Windows, or a heavyweight Unicode library like ICU, to convert the data to a more standard encoding.
