Decoding a string in C#
I created a TCP server that distributes clients' messages and ran into a problem. When I send Cyrillic messages through the stream, they are not decoded properly.
Does anyone know how I can fix that?
Here's the code for sending the message:
var message = Console.ReadLine().ToCharArray().Select(x => (byte)x).ToArray();
stream.Write(message);
Here's the code for receiving:
var numberOfBytes = stream.Read(buffer,0,1024);
Console.WriteLine($"{numberOfBytes} bytes received");
var chars = buffer.Select(x=>(char)x).ToArray();
var message = new string(chars);
The problem is that a char in C# is a 2-byte UTF-16 code unit. Cyrillic characters have code points above 255 (the Cyrillic block starts at U+0400), so casting one to a byte discards the high byte and you lose information.
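To make the loss concrete, here is a minimal sketch that mimics the casts from the question. The class and method names (`TruncationDemo`, `CastToBytes`, `CastToString`) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Linq;

static class TruncationDemo
{
    // Mimics the question's sending code: keeps only the low 8 bits
    // of each UTF-16 code unit (unchecked makes the truncation explicit).
    public static byte[] CastToBytes(string text) =>
        text.Select(c => unchecked((byte)c)).ToArray();

    // Mimics the question's receiving code: widens each byte back to a char.
    public static string CastToString(byte[] bytes) =>
        new string(bytes.Select(b => (char)b).ToArray());
}
```

With this sketch, `CastToString(CastToBytes("Hi"))` gives back `"Hi"`, because ASCII characters fit in one byte. For `"Привет"` the high byte of every character is dropped (for example, `'П'` is U+041F and becomes 0x1F), so the result is an unrelated string of control and ASCII characters.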
To convert a string to a byte array, use the Encoding class:
byte[] buffer = System.Text.Encoding.UTF8.GetBytes(Console.ReadLine());
To convert it back to a string on the receiver's end, write:
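// Decode only the bytes actually received (numberOfBytes is the value returned by stream.Read).
string message = System.Text.Encoding.UTF8.GetString(buffer, 0, numberOfBytes);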
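As a small sketch of the round trip, the helper below wraps the two `Encoding.UTF8` calls. The names `Utf8RoundTrip`, `Encode`, and `Decode` are hypothetical. It assumes, as in the question, that the receive buffer is larger than the message, so the number of bytes returned by `stream.Read` is passed along when decoding.

```csharp
using System;
using System.Text;

static class Utf8RoundTrip
{
    // Sender side: turn the text into UTF-8 bytes.
    public static byte[] Encode(string text) => Encoding.UTF8.GetBytes(text);

    // Receiver side: decode only the first `count` bytes of the buffer.
    // Decoding the whole buffer would also turn the unused zero bytes
    // into '\0' characters at the end of the string.
    public static string Decode(byte[] buffer, int count) =>
        Encoding.UTF8.GetString(buffer, 0, count);
}
```

For example, `Encode("Привет")` produces 12 bytes, since each of these Cyrillic letters takes 2 bytes in UTF-8. Copying them into a 1024-byte buffer and calling `Decode(buffer, 12)` returns the original text.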
Another problem is that Stream.Read is not guaranteed to read all bytes of your message at once (a TCP stream is just a sequence of bytes; it does not know where your messages begin and end). So it can happen, for example, that the last byte of the received byte array is only the first byte of a 2-byte character, and you receive the other byte the next time you call Stream.Read.
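The split-character case can be reproduced without a network. The sketch below (class and method names are hypothetical) decodes a sequence of byte chunks in two ways: chunk by chunk with `Encoding.UTF8.GetString`, and with a single `System.Text.Decoder`, which keeps an incomplete trailing byte sequence in its internal state until the next call.

```csharp
using System;
using System.Text;

static class ChunkedDecoding
{
    // Decodes every chunk on its own. A character whose bytes are split
    // across two chunks is replaced by U+FFFD replacement characters.
    public static string DecodeNaively(params byte[][] chunks)
    {
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
            sb.Append(Encoding.UTF8.GetString(chunk));
        return sb.ToString();
    }

    // Uses one stateful Decoder for all chunks, so a partial character
    // at the end of a chunk is completed by the start of the next one.
    public static string DecodeWithState(params byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // GetMaxCharCount gives a safe upper bound for the output size.
            var chars = new char[Encoding.UTF8.GetMaxCharCount(chunk.Length)];
            int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, written);
        }
        return sb.ToString();
    }
}
```

If the 12 UTF-8 bytes of `"Привет"` are split into a 3-byte chunk and a 9-byte chunk, `DecodeNaively` returns a damaged string, while `DecodeWithState` returns `"Привет"`.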
There are several solutions to this issue:
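One commonly used approach, which may or may not be among those the answer had in mind, is length-prefix framing: send the byte length of each message first, then read exactly that many bytes before decoding. The following is a minimal sketch under that assumption. The names `Framing`, `WriteMessage`, `ReadMessage`, and `ReadExactly` are hypothetical, and the 4-byte header uses the machine's byte order via `BitConverter`, so both ends are assumed to agree on it.

```csharp
using System;
using System.IO;
using System.Text;

static class Framing
{
    // Writes a 4-byte length header followed by the UTF-8 payload.
    public static void WriteMessage(Stream stream, string message)
    {
        byte[] payload = Encoding.UTF8.GetBytes(message);
        byte[] header = BitConverter.GetBytes(payload.Length);
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one complete message: header first, then exactly `length` bytes.
    public static string ReadMessage(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        int length = BitConverter.ToInt32(header, 0);
        if (length < 0) throw new InvalidDataException("Negative message length.");
        byte[] payload = ReadExactly(stream, length);
        return Encoding.UTF8.GetString(payload);
    }

    // Calls Stream.Read in a loop until `count` bytes have arrived,
    // because a single Read may return fewer bytes than requested.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }
}
```

Because the payload is decoded only after all of its bytes have arrived, a multi-byte character can never be cut in half. A real server would probably also want an upper limit on the accepted length.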
To convert a string to bytes, use System.Text.Encoding.GetBytes(string). I suggest you change the sending code to:
// using System.Text;
var messageAsBytes = Encoding.UTF8.GetBytes(Console.ReadLine());
To convert bytes to a string, use System.Text.Encoding.GetString(byte[]). If you receive UTF-8-encoded bytes:
// using System.Text;
var messageAsString = Encoding.UTF8.GetString(buffer);