What determines character encoding in C++

I'm developing for Windows. The software is a client-server solution: the client is written in C# and the server in C++.

Today I wanted to implement a simple feature: send a string from the client, receive it on the server, and write it to an XML file.

My problem is that the characters I see on the server side are ANSI encoded. As far as I know, a C# string is Unicode encoded, so why does my C++ server app encode the string as ANSI? I don't think my communication module modifies the string.

C# strings are UTF-16 encoded. If you send them as they are, you might want to use std::u16string on the server instead of the regular std::string.
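As a rough sketch of the std::u16string route: assuming the client sends the raw bytes produced by Encoding.Unicode (which is UTF-16 little-endian), the server could reassemble them into code units like this. The function name utf16le_to_u16string and the use of std::vector<unsigned char> as the receive buffer are assumptions made for illustration, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rebuilds UTF-16 code units from a little-endian
// byte buffer. A trailing odd byte, if any, is ignored.
std::u16string utf16le_to_u16string(const std::vector<unsigned char>& bytes) {
    std::u16string out;
    out.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        // Low byte comes first in little-endian order.
        out.push_back(static_cast<char16_t>(bytes[i] | (bytes[i + 1] << 8)));
    }
    return out;
}
```

For example, the bytes 0x48 0x00 0x69 0x00 become the two code units of u"Hi".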

Another option is to convert the C# string to and from a different encoding, such as UTF-8, with the System.Text.Encoding class.

Since I/O is much slower than CPU work, I/O time is proportional to the amount of data transferred, and UTF-8 is usually more compact than UTF-16, the common practice is to use UTF-8 for network communication.
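To make the size argument concrete, here is a minimal sketch that compares how many bytes the same text needs in each encoding. Both function names are hypothetical, and the UTF-8 count assumes the input is well-formed UTF-16 with correctly paired surrogates.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to store `s` as UTF-16 (2 bytes per code unit on typical platforms).
std::size_t utf16_byte_size(const std::u16string& s) {
    return s.size() * sizeof(char16_t);
}

// Bytes needed to store the same text as UTF-8.
// Assumes `s` is well-formed UTF-16.
std::size_t utf8_byte_size(const std::u16string& s) {
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char16_t u = s[i];
        if (u < 0x80) {
            bytes += 1;                       // ASCII
        } else if (u < 0x800) {
            bytes += 2;                       // e.g. Latin accents, Greek, Cyrillic
        } else if (u >= 0xD800 && u <= 0xDBFF) {
            bytes += 4;                       // surrogate pair: one code point
            ++i;                              // skip the low surrogate
        } else {
            bytes += 3;                       // rest of the Basic Multilingual Plane
        }
    }
    return bytes;
}
```

For u"hello" this gives 10 bytes in UTF-16 and 5 in UTF-8. The "usually" matters, though: for u"\u4E2D\u6587" (two CJK characters) it is 4 bytes in UTF-16 and 6 in UTF-8.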

Hence, I would convert the C# strings to UTF-8 before sending them and use std::string on the server side.

Keep in mind, though, that std::string is not UTF-8 aware: it is just a sequence of bytes. Writing something like str[0] gives you only the first byte, which is not the full UTF-8 sequence when the first character is encoded in more than one byte.
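A small sketch of what that means in practice, assuming the std::string holds valid UTF-8. The helper names below are hypothetical; they only inspect the lead byte and do not validate the continuation bytes.

```cpp
#include <cstddef>
#include <string>

// Length in bytes of the UTF-8 sequence that starts with `lead`.
// Returns 0 if `lead` is a continuation byte or not a valid lead byte.
std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Returns all bytes of the first encoded character, or an empty
// string if the input is empty or starts with an invalid or truncated sequence.
std::string first_utf8_char(const std::string& s) {
    if (s.empty()) return {};
    std::size_t n = utf8_sequence_length(static_cast<unsigned char>(s[0]));
    if (n == 0 || n > s.size()) return {};
    return s.substr(0, n);
}

// Counts code points by counting every byte that is not a
// continuation byte (10xxxxxx).
std::size_t utf8_code_point_count(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

With std::string s = "\xC3\xA9" "a" (the text "éa"), s.size() is 3 and s[0] is the single byte 0xC3, while first_utf8_char(s) returns the two bytes of "é" and utf8_code_point_count(s) returns 2.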
