I am trying to modify some software written in C#, but I am not really a developer. The code reads data from a client and extracts values from it. The problem is that when values from the client contain non-English characters, they come out as gibberish. The code in question is:
public static string ReadNT(BinaryReader stream)
{
    string ret = "";
    byte addByte = 0x00;
    do {
        addByte = ReadByte(stream);
        if (addByte != 0x00)
            ret += (char)addByte;
    } while (addByte != 0x00);
    return ret;
}
As far as I can tell, it walks through the stream and converts each byte to a character, one at a time, to build the string. The problem is that this doesn't work with Unicode/UTF-8. Is there a way to read this into a string that handles UTF-8 values?
Try this:
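public static string ReadNT(BinaryReader stream)
{
    // Requires: using System.Collections.Generic; using System.IO; using System.Text;
    List<byte> bytes = new List<byte>();
    byte addByte = 0x00;
    do
    {
        addByte = ReadByte(stream);
        if (addByte != 0x00)
        {
            // Collect the raw byte; do not cast it to char here.
            bytes.Add(addByte);
        }
    } while (addByte != 0x00);
    // Decode the whole sequence at once so multi-byte characters stay intact.
    return Encoding.UTF8.GetString(bytes.ToArray());
}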
You can't convert the bytes one at a time, because in UTF-8 a single character can be encoded as more than one byte; hence my use of the List<byte> to gather up all the bytes before decoding them.
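To see why the per-byte cast breaks, here is a minimal sketch that decodes the same bytes both ways. The class name CastVsDecode and the sample string are only for illustration; they are not part of the original code.

```csharp
using System;
using System.Text;

public static class CastVsDecode
{
    // Mimics the loop from the question: every byte becomes its own char.
    public static string CastEachByte(byte[] data)
    {
        var sb = new StringBuilder();
        foreach (byte b in data)
            sb.Append((char)b);
        return sb.ToString();
    }

    public static void Main()
    {
        // "é" (U+00E9) is encoded in UTF-8 as the two bytes 0xC3 0xA9.
        byte[] data = Encoding.UTF8.GetBytes("caf\u00E9");

        Console.WriteLine(data.Length);                                   // 5 bytes for 4 characters
        Console.WriteLine(CastEachByte(data).Length);                     // 5 chars: "é" turned into two chars
        Console.WriteLine(Encoding.UTF8.GetString(data) == "caf\u00E9");  // True
    }
}
```

The cast version produces U+00C3 and U+00A9 ("Ã©") in place of "é", which is the kind of gibberish described in the question.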
The big caveat here is that you need to be sure the client is actually sending you UTF-8 encoded text.
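If you are not sure what the client sends, one option is to decode with a strict UTF8Encoding instance, which throws on invalid byte sequences instead of silently substituting replacement characters. This is only a sketch; the StrictUtf8 class and TryDecode method are made-up names for illustration.

```csharp
using System;
using System.Text;

public static class StrictUtf8
{
    // throwOnInvalidBytes: true makes GetString throw on malformed input
    // instead of inserting U+FFFD replacement characters.
    private static readonly UTF8Encoding Strict =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Returns true and the decoded text if data is valid UTF-8, otherwise false.
    public static bool TryDecode(byte[] data, out string text)
    {
        try
        {
            text = Strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = null;
            return false;
        }
    }
}
```

For example, the bytes 0x63 0x61 0x66 0xE9 ("café" in Latin-1) are not valid UTF-8, so TryDecode returns false for them, which hints that the client is using a different encoding.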
Edit:
Further to the comments on this answer, quoting from "Can UTF-8 contain zero byte?":
Yes, the zero byte in UTF8 is code point 0, NUL. There is no other Unicode code point that will be encoded in UTF8 with a zero byte anywhere within it.
Therefore it is safe to assume that a zero byte you receive is NUL (here, the string terminator) and not part of the encoding of some other code point.
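A quick way to convince yourself of this is to encode some non-ASCII text and look for zero bytes. The sketch below uses a made-up helper, ContainsZeroByte, and also shows that the same assumption does not hold for UTF-16.

```csharp
using System;
using System.Text;

public static class ZeroByteCheck
{
    // True if the encoded form of text contains a 0x00 byte.
    public static bool ContainsZeroByte(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return Array.IndexOf(bytes, (byte)0) >= 0;
    }

    public static void Main()
    {
        string sample = "h\u00E9llo \u65E5\u672C\u8A9E"; // "héllo 日本語"

        Console.WriteLine(ContainsZeroByte(Encoding.UTF8, sample));     // False
        Console.WriteLine(ContainsZeroByte(Encoding.UTF8, "a\0b"));     // True: only an actual NUL gives 0x00
        Console.WriteLine(ContainsZeroByte(Encoding.Unicode, sample));  // True: UTF-16 encodes 'h' as 0x68 0x00
    }
}
```

So a zero-byte terminator is fine for UTF-8, but the same loop would cut UTF-16 text short.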
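You could also use the StreamReader class to decode the UTF-8 text. It wraps the underlying Stream (stream.BaseStream) rather than the BinaryReader, and you read characters from it until the NUL terminator. Be aware that StreamReader reads ahead into its own buffer, so it may consume bytes that follow the terminator; this only suits cases where nothing else is read from the same stream afterwards.
public static string ReadNT(BinaryReader stream)
{
    // Decode UTF-8 directly from the underlying stream.
    var reader = new StreamReader(stream.BaseStream, Encoding.UTF8, false);
    var text = new StringBuilder();
    for (int c = reader.Read(); c > 0; c = reader.Read()) text.Append((char)c); // stop at NUL (0) or end of stream (-1)
    return text.ToString();
}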
If you control the protocol, consider sending the size of the string (its length in bytes, not characters) along with the string itself.
public static string ReadNT(BinaryReader stream, int length)
{
    return Encoding.UTF8.GetString(stream.ReadBytes(length));
}
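As a rough sketch of what a length-prefixed exchange could look like on both ends: the 4-byte Int32 prefix and the names LengthPrefixed, WriteString and ReadString are assumptions for illustration, not something the original protocol is known to use.

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixed
{
    // Hypothetical sender side: 4-byte length (in bytes), then the UTF-8 payload.
    public static void WriteString(BinaryWriter writer, string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        writer.Write(payload.Length); // Int32, little-endian
        writer.Write(payload);
    }

    // Receiver side: read the length, then exactly that many bytes.
    public static string ReadString(BinaryReader reader)
    {
        int length = reader.ReadInt32();
        byte[] payload = reader.ReadBytes(length);
        if (payload.Length != length)
            throw new EndOfStreamException("Stream ended before the full string was read.");
        return Encoding.UTF8.GetString(payload);
    }

    public static void Main()
    {
        var ms = new MemoryStream();
        var writer = new BinaryWriter(ms);
        WriteString(writer, "na\u00EFve");
        writer.Flush();

        ms.Position = 0;
        Console.WriteLine(ReadString(new BinaryReader(ms)) == "na\u00EFve"); // True
    }
}
```

Because the reader knows the byte count up front, the text may even contain NUL characters, and no byte-by-byte loop is needed.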