
Why are bytes in C# named byte and sbyte, unlike the other integral types?

I was just flipping through the specification and noticed that byte is the odd one out. The others are short, ushort, int, uint, long, and ulong. Why the naming sbyte and byte instead of byte and ubyte?

It's a matter of semantics. When you think of a byte, you usually (at least I do) think of an 8-bit value from 0 to 255. So that's what byte is. The less common interpretation of the same binary data is a signed value (sbyte) from -128 to 127.
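As a quick illustration (a minimal sketch using top-level statements), the framework's MinValue/MaxValue constants show the two ranges, and that the unadorned name byte is the unsigned one:

```csharp
// Ranges of the two 8-bit integral types in C#.
Console.WriteLine($"byte : {byte.MinValue} .. {byte.MaxValue}");   // 0 .. 255
Console.WriteLine($"sbyte: {sbyte.MinValue} .. {sbyte.MaxValue}"); // -128 .. 127
```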

With integers, it's more intuitive to think in terms of signed values, so that's what the basic name style represents. The u prefix then allows access to the less common unsigned semantics.
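For the larger integral types the convention runs the other way: the plain name is the signed type, and the u prefix selects the unsigned variant, as the same constants show:

```csharp
// The unprefixed names are signed; the u-prefixed names are unsigned.
Console.WriteLine($"short : {short.MinValue} .. {short.MaxValue}");   // -32768 .. 32767
Console.WriteLine($"ushort: {ushort.MinValue} .. {ushort.MaxValue}"); // 0 .. 65535
Console.WriteLine($"int   : {int.MinValue} .. {int.MaxValue}");       // -2147483648 .. 2147483647
Console.WriteLine($"uint  : {uint.MinValue} .. {uint.MaxValue}");     // 0 .. 4294967295
```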

The reason a type called "byte", with no other adjective, is usually unsigned, while a type called "int", with no other adjective, is usually signed, is that unsigned 8-bit values are more practical (and thus more widely used) than signed bytes, whereas signed integers of larger sizes are more practical (and thus more widely used) than their unsigned counterparts.

There is a common linguistic principle that, if a "thing" comes in two types, "usual" and "unusual", the term "thing" without an adjective means a "usual thing"; the term "unusual thing" is used to refer to the unusual type. Following that principle, since unsigned 8-bit quantities are more widely used than signed ones, the term "byte" without modifiers refers to the unsigned flavor. Conversely, since signed integers of larger sizes are more widely used than their unsigned equivalents, terms like "int" and "long" refer to the signed flavors.

As for the reason behind such usage patterns: if one is performing math on numbers of a certain size, it generally won't matter (outside of comparisons) whether the numbers are signed or unsigned. There are times when it's convenient to regard them as signed (it's more natural, for example, to think in terms of adding -1 to a number than adding 65535), but for the most part, declaring numbers to be signed doesn't require any extra work from the compiler except when one is either performing comparisons or extending the numbers to a larger size. Indeed, if anything, signed integer math may be faster than unsigned integer math (since unsigned integer math is required to behave predictably in case of overflow, whereas signed math isn't).
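A small C# sketch (using an unchecked block and values I chose for illustration) shows why the distinction rarely matters for same-size arithmetic: at the bit level, adding -1 to a 16-bit value is the same operation as adding 65535.

```csharp
ushort u = 5;
short s = 5;

unchecked
{
    // Arithmetic on 16-bit operands is performed in int, so cast back to 16 bits.
    ushort uSum = (ushort)(u + 65535); // wraps around: 65540 mod 65536 = 4
    short sSum = (short)(s + (-1));    // ordinary signed addition: 4

    // Both results carry the identical bit pattern 0x0004.
    Console.WriteLine(uSum); // 4
    Console.WriteLine(sSum); // 4
}
```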

By contrast, since 8-bit operands must be extended to type 'int' before any math is performed on them, the compiler must generate different code to handle signed and unsigned operands; in most cases, the signed operands will require more code than the unsigned ones. Thus, in cases where it wouldn't matter whether an 8-bit value was signed or unsigned, it often makes more sense to use unsigned values. Further, numbers of larger types are often decomposed into a sequence of 8-bit values or reconstituted from such a sequence. Such operations are easier with 8-bit unsigned types than with 8-bit signed types. For these reasons, among others, unsigned 8-bit values are used much more commonly than signed 8-bit values.
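For instance, here is a minimal sketch (with an arbitrary sample value) of splitting an int into bytes and reassembling it. With byte, the shifts and masks compose directly; an sbyte would sign-extend to a negative int when promoted and need an extra mask on the way back:

```csharp
int value = 0x12FE56AB;

// Decompose into unsigned 8-bit pieces.
byte b0 = (byte)(value & 0xFF);         // 0xAB
byte b1 = (byte)((value >> 8) & 0xFF);  // 0x56
byte b2 = (byte)((value >> 16) & 0xFF); // 0xFE
byte b3 = (byte)((value >> 24) & 0xFF); // 0x12

// Reconstitute: each byte zero-extends to int, so a plain OR works.
int rebuilt = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
Console.WriteLine(rebuilt == value); // True

// With sbyte, the same pieces would sign-extend when promoted to int
// (e.g. 0xAB becomes -85, i.e. 0xFFFFFFAB), corrupting the OR unless
// every term were masked again with & 0xFF.
```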

Note that in the C language, "char" is an odd case, since all characters within the C character set are required to translate as non-negative values (so machines which use an 8-bit char type with an EBCDIC character set are required to have "char" be unsigned), but an "int" is required to hold all values that a "char" can hold (so machines where both "char" and "int" are 16 bits are required to have "char" be signed).
