Copying to byte array doesn't behave as expected

Question

I have this code:

using System.Text;

var testString = "TestOneString";

var testStringBytes = Encoding.UTF8.GetBytes(testString);

var allBytes = new byte[testStringBytes.Length+2];

allBytes[0] = (byte) testStringBytes.Length;

Console.WriteLine("Length: " + allBytes[0]); // this is 13.

testStringBytes.CopyTo(allBytes,1); // It should be copied from 1 to 13. So the string is allBytes[1] to allBytes[13] or am I wrong?

var printTest = Encoding.UTF8.GetString(allBytes[1..(testStringBytes.Length)]); //allBytes[1..13]
Console.WriteLine(printTest); // this gives back: TestOneStrin

var printTest2 = Encoding.UTF8.GetString(allBytes[1..(testStringBytes.Length+1)]); // why do I need to put the +1 there? This means this is allBytes[1..14]
Console.WriteLine(printTest2); // this gives back: TestOneString (the full thing).

/*However what I don't understand is, if (testStringBytes.Length+1) is 14
 * why can I change the 14th byte to anything and the string is still going to print normally? doesn't that mean that 14th byte has nothing to do with the string?
 */

allBytes[testStringBytes.Length + 1] = (byte) (new Random().Next());

var printTest3 = Encoding.UTF8.GetString(allBytes[1..(testStringBytes.Length + 1)]);
Console.WriteLine(printTest3); // this gives back: TestOneString (the full thing).
                               // So why does it cut when I don't add +1 (which is 14, when the 14th byte has nothing to do with the string??)

The explanation of my problem is in the comments.

I'm copying a string to a byte array. The string has a length of 13, and I'm starting at index 1, so it should be copied to byteArray[1]..byteArray[13]. But when I try to convert bytes 1 to 13 back to a string, the last character is cut off. So I need to use byteArray[1] to 14, which doesn't make any sense, as the 14th byte has nothing to do with the string. (printTest3 shows that if I assign anything random to the 14th byte, the string is still complete.)

This is the console output of this program:

Length: 13
TestOneStrin
TestOneString
TestOneString

Can someone explain why I need to use 1 to 14 if the 14th byte can be literally anything?

Answer 1

In some environments, like .NET or Java, characters may take more than one byte, especially with UTF-8 encoding:

This property returns a UTF8Encoding object that encodes Unicode (UTF-16-encoded) characters into a sequence of one to four bytes per character, and that decodes a UTF-8-encoded byte array to Unicode (UTF-16-encoded) characters. For information about the character encodings supported by .NET and a discussion of which Unicode encoding to use, see Character Encoding in .NET.

(Quoted from the .NET documentation for the Encoding.UTF8 property.)

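For that reason, you can't assume in general that a 13-character string will take 13 bytes. Here, though, every character is ASCII, so the string does encode to exactly 13 bytes, as the printed length shows. The missing character comes from the range operator instead: in C#, the end of a range is exclusive, so allBytes[1..13] covers indices 1 through 12, and allBytes[1..14] covers indices 1 through 13. Index 14 is never part of the slice, which is why overwriting it changes nothing.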
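To make both points concrete, here is a minimal sketch, assuming .NET 6 or later with top-level statements. It first compares UTF-8 byte counts for the question's string and for a hypothetical variant containing one non-ASCII character, then repeats the copy from the question and slices the buffer in three ways. The names `accented`, `payload`, `buffer`, `tooShort`, `complete`, and `sameThing` are illustrative and do not appear in the original code.

```csharp
using System;
using System.Text;

// UTF-8 byte counts: ASCII characters take one byte each, others take more.
var ascii = "TestOneString";
var accented = "Test\u00D6neString"; // hypothetical variant: 13 characters, one of them non-ASCII
Console.WriteLine(Encoding.UTF8.GetByteCount(ascii));    // 13
Console.WriteLine(Encoding.UTF8.GetByteCount(accented)); // 14

// Same layout as in the question: one length byte, then the payload.
var payload = Encoding.UTF8.GetBytes(ascii);
var buffer = new byte[payload.Length + 2];
buffer[0] = (byte)payload.Length;
payload.CopyTo(buffer, 1); // fills buffer[1] through buffer[13]

// The end of a C# range is exclusive: a..b covers indices a through b - 1.
var tooShort = Encoding.UTF8.GetString(buffer[1..payload.Length]);       // indices 1 through 12
var complete = Encoding.UTF8.GetString(buffer[1..(payload.Length + 1)]); // indices 1 through 13

// The (bytes, index, count) overload avoids the range arithmetic entirely.
var sameThing = Encoding.UTF8.GetString(buffer, 1, payload.Length);

Console.WriteLine(tooShort);  // TestOneStrin
Console.WriteLine(complete);  // TestOneString
Console.WriteLine(sameThing); // TestOneString
```

Passing an offset and a count, as in the last call, states the intent ("13 bytes starting at index 1") directly, so no end index has to be worked out.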

solution1
1 2022-03-02 12:32:35
