
String to byte array only converts first 16 bytes according to IntelliSense

I'm trying to convert a string to a byte[] using the ASCIIEncoding class in the .NET library. The string will never contain non-ASCII characters, but it will usually have a length greater than 16. My code looks like the following:

public static byte[] Encode(string packet)
{
    ASCIIEncoding enc = new ASCIIEncoding();
    byte[] byteArray = enc.GetBytes(packet);
    return byteArray;
}

By the end of the method, the byte array should contain packet.Length bytes, but IntelliSense tells me that all bytes after byteArray[15] are literally question marks that cannot be observed. I used Wireshark to view byteArray after I sent it, and it was received fine on the other side, but the end device did not follow the instructions encoded in byteArray. I'm wondering if this has anything to do with IntelliSense not being able to display all elements in byteArray, or if my packet is completely wrong.

If your packet string basically contains characters in the range 0-255, then ASCIIEncoding is not what you should be using. ASCII only defines character codes 0-127; anything in the range 128-255 will get turned into question marks (as you have observed) because those characters are not defined in ASCII.
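
For example, here is a minimal sketch of that replacement behavior (using Encoding.ASCII from System.Text; the string literal is just an illustration):

using System;
using System.Text;

class ReplacementDemo
{
    static void Main()
    {
        // 'ÿ' (U+00FF) is outside the ASCII range, so it encodes as '?' (0x3F).
        byte[] bytes = Encoding.ASCII.GetBytes("AB\u00FFC");
        Console.WriteLine(BitConverter.ToString(bytes)); // prints 41-42-3F-43
    }
}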

Consider using a method like this to convert the string to a byte array. (This assumes that the ordinal value of each character is in the range 0-255 and that the ordinal value is what you want.)

// Extension methods must be declared in a non-generic static class;
// the class name here is arbitrary.
public static class StringEncodingExtensions
{
    public static byte[] ToOrdinalByteArray(this string str)
    {
        if (str == null) { throw new ArgumentNullException("str"); }

        var bytes = new byte[str.Length];
        for (int i = 0; i < str.Length; ++i) {
            // Wrapping the cast in checked() will trigger an OverflowException
            // if the character being converted is out of range for a byte.
            bytes[i] = checked((byte)str[i]);
        }

        return bytes;
    }
}
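
For example, assuming the extension method above is in scope (the packet contents here are made up):

string packet = "\u0001\u00FF\u0010DATA";   // hypothetical packet; ordinals 1, 255, 16, ...
byte[] bytes = packet.ToOrdinalByteArray(); // { 0x01, 0xFF, 0x10, 0x44, 0x41, 0x54, 0x41 }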

The Encoding class hierarchy is specifically designed for handling text. What you have here doesn't seem to be text, so you should avoid using these classes.

The standard encoders use the replacement character fallback strategy. If a character doesn't exist in the target character set, they encode a replacement character ('?' by default).

To me, that's worse than a silent failure; it's data corruption. I prefer that libraries tell me when my assumptions are wrong.

You can get an encoding that throws an exception instead:

Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(), 
    new DecoderExceptionFallback());

If you are truly using only characters in Unicode's ASCII range then you'll never see an exception.
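
For instance, a quick sketch of what that looks like at the call site (the variable name is mine):

Encoding strictAscii = Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback());

byte[] ok  = strictAscii.GetBytes("ABC");      // fine: all characters are <= 0x7F
byte[] bad = strictAscii.GetBytes("AB\u00FF"); // throws EncoderFallbackException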
