
ASN.1 BER: Encode Integer 0x10000000FFFFFFFC with Indefinite-Length Encoding

I am implementing BER according to the spec and I am asking myself: how would I encode the integer 0x10000000FFFFFFFC as a BER INTEGER (tag 0x02) when I use the indefinite-length encoding?

I have not found any escape mechanism in the spec so far, so I assume that I cannot encode such a number this way and would therefore have to rely on the sender to know that as well and to send the integer with the short-form length encoding. But the same problem arises with bit strings.

You NEVER use indefinite length encoding with an integer ...

This is stated in X.690 (08/2015), clause 8.3 (Encoding of an integer value):

The encoding of an integer value shall be primitive
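For the integer from the question, a primitive, definite-length encoding could look like the following minimal Python sketch. The names `encode_length` and `encode_ber_integer` are made up for illustration and are not taken from any library; the sketch assumes the universal INTEGER tag 0x02 and minimal two's-complement contents octets.

```python
def encode_length(n: int) -> bytes:
    """Definite-length octets: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_ber_integer(value: int) -> bytes:
    """Primitive INTEGER: tag 0x02, definite length, two's-complement contents."""
    # Number of magnitude bits, leaving room for one sign bit.
    bits = value.bit_length() if value >= 0 else (~value).bit_length()
    size = bits // 8 + 1
    contents = value.to_bytes(size, "big", signed=True)
    return b"\x02" + encode_length(len(contents)) + contents


print(encode_ber_integer(0x10000000FFFFFFFC).hex(" "))
# 02 08 10 00 00 00 ff ff ff fc
```

The zero octets inside the contents are harmless here, because the length octet 0x08 already tells the decoder where the value ends.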

Indefinite length is used for constructed types (SEQUENCE, SEQUENCE OF ...) and for basic types that can contain large values (string types, BIT STRING, OCTET STRING ...). For those types the spec says something like:

The encoding of a bitstring value shall be either primitive or constructed at the option of the sender

The CER encoding rules (chapter 9) give you an idea of what "large value" meant at the time of writing:

 Bitstring, octetstring, and restricted character string values shall be encoded with a primitive encoding if they would require no more than 1000 contents octets, and as a constructed encoding otherwise

So you see that even a gigantic integer will in practice take fewer than 1000 bytes when encoded, hence the choice of never using the indefinite length form for an integer.
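The 1000-octet rule quoted above can be sketched for an OCTET STRING as follows. This is an illustration under stated assumptions, not a full CER encoder: the function names and the `SEGMENT` constant are hypothetical, and only the universal OCTET STRING tag (0x04 primitive, 0x24 constructed) is handled.

```python
SEGMENT = 1000  # CER fragment size for string types, per the rule quoted above


def encode_length(n: int) -> bytes:
    """Definite-length octets: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def cer_octet_string(data: bytes) -> bytes:
    """Primitive if it fits in 1000 contents octets, constructed otherwise."""
    if len(data) <= SEGMENT:
        return b"\x04" + encode_length(len(data)) + data
    out = bytearray(b"\x24\x80")  # constructed OCTET STRING, indefinite length
    for start in range(0, len(data), SEGMENT):
        chunk = data[start:start + SEGMENT]
        # Every fragment is itself a primitive, definite-length OCTET STRING.
        out += b"\x04" + encode_length(len(chunk)) + chunk
    out += b"\x00\x00"  # end-of-contents octets
    return bytes(out)
```

Note that every fragment carries its own definite length, so the payload bytes themselves are never scanned for the end-of-contents marker.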

AFAIK, indefinite length encoding is only defined for octet/bit strings, but not for numeric types.

Also, I do not think there is any escaping mechanism in BER.

Indefinite length encoding always contains chunks of data (octet/bit strings) encoded using definite length encoding. In BER parlance, indefinite length is always in the constructed form.

With definite length encoding you always have a byte count that tells you exactly where to cut the octet stream, so you do not need any sentinel.

With indefinite length encoding you need the two-zero-octet sentinel (which is in fact a TLV triplet with a zero-length value) to indicate the end of the data. But you never have raw, i.e. non-encoded, data as a payload, which could otherwise interfere with the sentinel.
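To make this concrete, here is a minimal decoder sketch in Python. It is an illustration only: `read_length` and `parse_tlv` are hypothetical names, only single-octet tags are supported, and definite-length constructed values are returned as raw bytes rather than parsed further. The point is that the end-of-contents check happens only at TLV boundaries, never inside a value.

```python
def read_length(buf: bytes, pos: int):
    """Return (length, new_pos); length is None for the indefinite form."""
    first = buf[pos]
    pos += 1
    if first == 0x80:
        return None, pos
    if first < 0x80:
        return first, pos
    count = first & 0x7F
    return int.from_bytes(buf[pos:pos + count], "big"), pos + count


def parse_tlv(buf: bytes, pos: int = 0):
    """Parse one TLV at pos; return ((tag, value_or_children), new_pos)."""
    tag = buf[pos]  # assumption: single-octet tags only
    length, pos = read_length(buf, pos + 1)
    if length is not None:
        # Definite length: skip over the value without inspecting it.
        return (tag, buf[pos:pos + length]), pos + length
    if not tag & 0x20:
        raise ValueError("indefinite length requires a constructed encoding")
    children = []
    # End-of-contents is only looked for where a new TLV would start.
    while buf[pos:pos + 2] != b"\x00\x00":
        child, pos = parse_tlv(buf, pos)
        children.append(child)
    return (tag, children), pos + 2


# Indefinite-length SEQUENCE wrapping the INTEGER from the question.
data = bytes.fromhex("3080" "0208" "10000000fffffffc" "0000")
node, end = parse_tlv(data)
```

Here `node` is the SEQUENCE with a single INTEGER child whose contents are `10 00 00 00 ff ff ff fc`; the zero octets inside the integer are skipped by its length octet and are never mistaken for the sentinel.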

You are concerned about the scenario where (a) the alternative constructed encoding shown in figure 2 is used and (b) the content octets contain the octets 0x0000 which will (incorrectly) be interpreted as the End-of-contents marker.

When I first read the question, my first instinct was that surely BER says something about this. Maybe there is an escape mechanism to avoid 0x0000 in the contents. Maybe the encoding rules are such that 0x0000 will naturally never occur.

But after carefully scanning the X.690 spec several times, I could find nothing of the kind.

So, I think you are right: I think the sender is supposed to not use the alternative constructed encoding in such scenarios (which in practice means that it should never use the alternative constructed encoding for datatypes that have this potential problem, e.g. integers and bitstrings).

As a comparison, Thrift only allows a STOP marker (byte 00) in one very specific circumstance: as a "no more fields" marker when encoding a struct (see https://github.com/erikvanoosten/thrift-missing-specification/blob/master/rpc-spec-binary-protocol.asciidoc ).

