Copying to byte array doesn't behave as expected

I have this code:

using System.Text;

var testString = "TestOneString";

var testStringBytes = Encoding.UTF8.GetBytes(testString);

var allBytes = new byte[testStringBytes.Length+2];

allBytes[0] = (byte) testStringBytes.Length;

Console.WriteLine("Length: " + allBytes[0]); // this is 13.

testStringBytes.CopyTo(allBytes,1); // It should be copied from 1 to 13. So the string is allBytes[1] to allBytes[13] or am I wrong?

var printTest = Encoding.UTF8.GetString(allBytes[1..(testStringBytes.Length)]); //allBytes[1..13]
Console.WriteLine(printTest); // this gives back: TestOneStrin

var printTest2 = Encoding.UTF8.GetString(allBytes[1..(testStringBytes.Length+1)]); // why do I need to put the +1 there? This means this is allBytes[1..14]
Console.WriteLine(printTest2); // this gives back: TestOneString (the full thing).

/*However what I don't understand is, if (testStringBytes.Length+1) is 14
 * why can I change the 14th byte to anything and the string is still going to print normally? doesn't that mean that 14th byte has nothing to do with the string?
 */

allBytes[testStringBytes.Length + 1] = (byte) (new Random().Next());

var printTest3 = Encoding.UTF8.GetString(allBytes[1..(testStringBytes.Length + 1)]);
Console.WriteLine(printTest3); // this gives back: TestOneString (the full thing).
                               // So why does it cut when I don't add +1 (which is 14, when the 14th byte has nothing to do with the string??)

The explanation of my problem is in the comments.

I'm copying a string to a byte array. The string has a length of 13, and I'm starting at index 1, so it should be copied to byteArray[1]..byteArray[13]. But when I try to convert bytes 1 to 13 back to a string, the last character is cut off. So I need to use byteArray[1] to 14, but that doesn't make any sense, as the 14th byte has nothing to do with the string. (printTest3 shows that if I assign a random value to the 14th byte, the string is still complete.)

This is the console output of this program:

Length: 13
TestOneStrin
TestOneString
TestOneString

Can someone explain to me why I need to do 1 to 14, if the 14th byte can be literally anything?

In some environments, like .NET or Java, characters may take more than one byte. This is especially true for UTF-8 encoding:

This property returns a UTF8Encoding object that encodes Unicode (UTF-16-encoded) characters into a sequence of one to four bytes per character, and that decodes a UTF-8-encoded byte array to Unicode (UTF-16-encoded) characters. For information about the character encodings supported by .NET and a discussion of which Unicode encoding to use, see Character Encoding in .NET.

For reference.

For that reason, you can't assume that your 13-character string will take 13 bytes.
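The point about byte counts can be checked directly. The following is a minimal sketch, not a definitive fix: the variable names (ascii, accented, payload, slice) and the sample string "h\u00E9llo" are illustrative and do not come from the question. It compares character counts with UTF-8 byte counts, then rebuilds the length-prefixed buffer from the question. Note that for the question's ASCII-only string the byte count really is 13 (matching the printed "Length: 13"), and that the end of a C# range expression is exclusive, so 1..14 selects indices 1 through 13.

```csharp
using System;
using System.Text;

// ASCII-only text: one UTF-8 byte per character.
var ascii = "TestOneString";
Console.WriteLine(ascii.Length);                       // 13
Console.WriteLine(Encoding.UTF8.GetByteCount(ascii));  // 13

// Non-ASCII text: 'é' (U+00E9) takes two bytes in UTF-8.
var accented = "h\u00E9llo";
Console.WriteLine(accented.Length);                       // 5
Console.WriteLine(Encoding.UTF8.GetByteCount(accented));  // 6

// Length-prefixed buffer, built the same way as in the question.
var payload = Encoding.UTF8.GetBytes(ascii);
var allBytes = new byte[payload.Length + 2];
allBytes[0] = (byte)payload.Length;
payload.CopyTo(allBytes, 1);   // fills allBytes[1] through allBytes[13]

// The end of a C# range is exclusive: 1..14 selects indices 1 through 13.
var slice = allBytes[1..(payload.Length + 1)];
Console.WriteLine(slice.Length);                    // 13
Console.WriteLine(Encoding.UTF8.GetString(slice));  // TestOneString
```

If the intent is "start at index 1 and read N bytes", the overload Encoding.UTF8.GetString(allBytes, 1, payload.Length) expresses that directly and avoids the array copy that the range expression makes.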
