
ASN.1 / DER Encoding of Integers

I'm currently starting to work with DER (Distinguished Encoding Rules) encoding and have problems understanding the encoding of integers.

In the reference document https://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf this encoding is defined as follows:

8.3.1 The encoding of an integer value shall be primitive. The contents octets shall consist of one or more octets.

8.3.2 If the contents octets of an integer value encoding consist of more than one octet, then the bits of the first octet and bit 8 of the second octet:

  1. shall not all be ones; and

  2. shall not all be zero.

NOTE – These rules ensure that an integer value is always encoded in the smallest possible number of octets.

8.3.3 The contents octets shall be a two's complement binary number equal to the integer value, and consisting of bits 8 to 1 of the first octet, followed by bits 8 to 1 of the second octet, followed by bits 8 to 1 of each octet in turn up to and including the last octet of the contents octets.

On another site, https://docs.microsoft.com/en-us/windows/desktop/seccertenroll/about-integer , it is explained that for positive numbers whose binary representation starts with a 1, a zero byte is added at the front. This is also mentioned in the answers to an earlier Stack Overflow question: ASN Basic Encoding Rule of an integer.

Unfortunately, from these answers I cannot see how this latter instruction can be deduced from the rules of the reference document.

For example, if I want to encode the number 128, why can't I do this as

[tag byte] [length byte] 10000000?

I know that the correct encoding would be [tag byte] [length byte] 00000000 10000000, but which condition is violated by the variant above? It probably has something to do with two's complement, but isn't the two's-complement representation of 128 again 10000000?

I hope you can help me understand why the description on the Microsoft site is equivalent to the original definition. Thank you.

The two's-complement rule (8.3.3) says that if the high bit of the first (lowest-index) content byte is set, the number is negative.

The encoding 02 01 80 has contents 0b1000_0000. Since the high bit is set, the number is negative.

To find its magnitude, flip all the bits (0b0111_1111), then add one: 0b1000_0000. So 0x80 represents negative 128.

For a less degenerate example, 0b1000_0001 => 0b0111_1110 => 0b0111_1111, showing that 0x81 is negative 127.

For the number (positive) 127, since the high bit isn't set, the number is interpreted as positive, so the contents are just 0b0111_1111, aka 0x7F, resulting in 02 01 7F.
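The sign rule above can be checked directly in Python. The helper name below is my own; `int.from_bytes` with `signed=True` implements exactly this big-endian two's-complement interpretation:

```python
def decode_der_integer_contents(octets: bytes) -> int:
    """Interpret DER INTEGER content octets as a big-endian two's-complement value (X.690, 8.3.3)."""
    return int.from_bytes(octets, byteorder="big", signed=True)

# A single 0x80 octet has its high bit set, so it decodes as negative:
assert decode_der_integer_contents(b"\x80") == -128
assert decode_der_integer_contents(b"\x81") == -127
assert decode_der_integer_contents(b"\x7f") == 127
# The prepended zero octet is what keeps 128 positive:
assert decode_der_integer_contents(b"\x00\x80") == 128
```

This is why the questioner's proposed [tag byte] [length byte] 10000000 cannot mean 128: a decoder following 8.3.3 must read that single content octet as negative 128.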

A common pattern in ASN.1 is TLV: Type / Length / Value.

Type: one octet, 0x02 for integers.

Value: two's complement with the sign constraint, as explained in the answer above.

Length coding has two modes:

  1. The most significant bit of the first length octet is not set: then that octet itself is the content length.

  2. The most significant bit of the first length octet is set: then the first octet is followed by (value − 128) octets forming the actual length as a non-negative integer, byte order big-endian.

Lengths 0...127 use the first form; lengths of 128 and above use the second.
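The two length forms can be sketched in a few lines of Python (the function name is my own, not from any library):

```python
def encode_der_length(length: int) -> bytes:
    """Encode a DER length octet sequence: short form for 0..127, long form above that."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length <= 127:
        return bytes([length])                            # short form: high bit clear
    n = (length.bit_length() + 7) // 8                    # number of length octets that follow
    return bytes([0x80 | n]) + length.to_bytes(n, "big")  # long form: 0x80 | n, then n octets

assert encode_der_length(3) == b"\x03"
assert encode_der_length(128) == b"\x81\x80"      # 0x81: one length octet follows
assert encode_der_length(300) == b"\x82\x01\x2c"  # 0x82: two octets, 0x012C == 300
```

DER additionally requires the definite, minimal-length form, which is why the long form always uses the fewest octets that can hold the length.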
