I'm trying to send a C# string to a C++ wstring, and vice versa, over TCP.
I succeeded in sending the string data from C# (as Unicode, UTF-16) and receiving it in C++ into a char array.
But I have no idea how to convert that char array to a wstring.
This is what it looks like when C++ receives "abcd" as UTF-16:
[0] 97 'a' char
[1] 0 '\0' char
[2] 98 'b' char
[3] 0 '\0' char
[4] 99 'c' char
[5] 0 '\0' char
[6] 100 'd' char
[7] 0 '\0' char
This is what it looks like when C++ receives "한글" as UTF-16:
[0] 92 '\\' char
[1] -43 '?' char
[2] 0 '\0' char
[3] -82 '?' char
And this is what it looks like when C++ receives "日本語" as UTF-16:
[0] -27 '?' char
[1] 101 'e' char
[2] 44 ',' char
[3] 103 'g' char
[4] -98 '?' char
[5] -118 '?' char
Since UTF-8 doesn't support all Japanese characters, I tried to receive the data as UTF-16 (which C# strings use internally). But every way I have found to convert these char arrays to a wstring has failed.
This is what I tried first:
std::wstring_convert<std::codecvt_utf16<wchar_t>> myconv;
-> what the wstring should contain:
[0] 54620 '한' wchar_t
[1] 44544 '글' wchar_t
-> what it actually contains afterwards:
[0] 23765 '峕' wchar_t
[1] 174 '®' wchar_t
I also tried this:
std::wstring wsTmp(s.begin(), s.end());
-> what the wstring should contain:
[0] 54620 '한' wchar_t
[1] 44544 '글' wchar_t
-> what it actually contains afterwards:
[0] 92 '\\' wchar_t
[1] 65493 'ᅰ' wchar_t
[2] 0 '\0' wchar_t
[3] 65454 'ᆴ' wchar_t
In both cases I converted the char array to a string, then converted that to a wstring, and it failed.
Does anyone have any idea how to convert non-English UTF-16 char data to wstring data?
Added: the C# side code is
byte[] sendBuffer = Encoding.Unicode.GetBytes(Console.ReadLine());
clientSocket.Send(sendBuffer);
and it converts '한글' into bytes like this:
[0] 92 byte
[1] 213 byte
[2] 0 byte
[3] 174 byte
I try to send C# string data to C++ wstring data and vice versa. (by TCP)
I succeeded in sending string data from C# (as Unicode, UTF-16) and receiving it in C++ into a char array.
It would be better, and more portable, to transmit the data using UTF-8 instead of UTF-16.
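As a sketch of that approach (assuming the C# side switches to Encoding.UTF8.GetBytes; utf8_bytes_to_wstring is a name I've made up), the C++ side could decode the received UTF-8 bytes like this:

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical receive path: decode UTF-8 bytes from the socket into a
// wide string. codecvt_utf8_utf16 produces UTF-16 code units, matching
// Windows' 16-bit wchar_t (deprecated since C++17, but still available).
std::wstring utf8_bytes_to_wstring(const char* buffer, std::size_t buflen)
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
    return conv.from_bytes(buffer, buffer + buflen);
}
```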
But I have no idea how to convert that char array to a wstring.
On platforms where wchar_t is 16-bit, such as Windows (which I presume you are on, as you are using C#), you can copy your char array content as-is directly into a std::wstring, eg:
char *buffer = ...;
int buflen = ...;
std::wstring wstr(reinterpret_cast<wchar_t*>(buffer), buflen / sizeof(wchar_t));
If you need to support platforms where wchar_t is 32-bit instead, you can use std::wstring_convert:
char *buffer = ...;
int buflen = ...;
// C#'s Encoding.Unicode produces UTF-16LE, while codecvt_utf16 defaults
// to big-endian, so the little_endian mode flag is needed here:
std::wstring_convert<std::codecvt_utf16<wchar_t, 0x10FFFF, std::little_endian>, wchar_t> conv;
std::wstring wstr = conv.from_bytes(std::string(buffer, buflen));
// or:
// std::wstring wstr = conv.from_bytes(buffer, buffer + buflen);
Since wchar_t is not very portable, consider using std::u16string / char16_t instead (if you are using a compiler that supports C++11 or later, that is), as they were designed specifically for UTF-16 data.
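For example, the "한글" buffer from your question (bytes 92, 213, 0, 174) can be wrapped like this; bytes_to_u16string is a hypothetical helper name, and the code assumes a little-endian host CPU (true for x86/x64), so no byte swap is needed:

```cpp
#include <cstring>
#include <string>

// Wrap received UTF-16LE bytes in a std::u16string, whose char16_t
// elements are exactly 16 bits on every platform (unlike wchar_t).
// Assumes a little-endian host, so the bytes are copied unchanged.
std::u16string bytes_to_u16string(const char* buffer, std::size_t buflen)
{
    std::u16string result(buflen / sizeof(char16_t), u'\0');
    // std::memcpy sidesteps the alignment concerns of a reinterpret_cast
    std::memcpy(&result[0], buffer, result.size() * sizeof(char16_t));
    return result;
}
```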
Since UTF-8 doesn't support all Japanese characters
Yes, it does. Unicode is the actual character set, UTFs are just encodings for representing Unicode codepoints as byte sequences. ALL UTFs (UTF-7, UTF-8, UTF-16, and UTF-32) support the ENTIRE Unicode character set, and UTFs are designed to allow for loss-less conversion from one UTF to another.
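A quick way to see this is to round-trip your own test string through UTF-8; a minimal sketch (std::codecvt_utf8_utf16 is deprecated since C++17 but still available):

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Converting "日本語" to UTF-8 and back loses nothing: UTF-8 encodes
// the entire Unicode repertoire, just like UTF-16.
std::string to_utf8(const std::u16string& s)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(s);
}

std::u16string from_utf8(const std::string& s)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.from_bytes(s);
}
```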