
Forward compatibility in storage size constrained protocol

I have a simple protocol consisting of, let's say, 4 fields:

Field-1 (4-bits)
Field-2 (6-bits)
Field-3 (4-bits)
Field-4 (2-bits)

Currently, I organize them so they are byte-aligned as:

Field-1,Field-3,Field-2,Field-4

In total, the message occupies 2 bytes with 0 bytes overhead.

To make this backwards compatible, so that I can understand messages from a previous version, I add a 1-byte version field at the beginning, and it becomes:

Version-Field,Field-1,Field-3,Field-2,Field-4

3 bytes in total with an overhead of 1 byte.
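
For concreteness, the 3-byte layout could be packed roughly as in the sketch below. The bit order inside each byte is an assumption (the first-listed field of each pair goes in the low bits), and `packMessage` is a hypothetical helper name used only for illustration.

```javascript
// Pack the version byte and the four fields into 3 bytes.
// Assumed bit order (not fixed by the description above): the first-listed
// field of each byte occupies the low bits.
function packMessage(version, field1, field2, field3, field4) {
    return Uint8Array.of(
        version & 0xff,
        (field1 & 0x0f) | ((field3 & 0x0f) << 4), // Field-1 (4 bits) + Field-3 (4 bits)
        (field2 & 0x3f) | ((field4 & 0x03) << 6)  // Field-2 (6 bits) + Field-4 (2 bits)
    );
}

const packed = packMessage(1, 0x5, 0x2a, 0xc, 0x3);
console.log(Array.from(packed)); // the bytes 1, 197, 234 (0x01, 0xc5, 0xea)
```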

How do I add forwards compatibility such that I can add new fields in new versions of the protocol while ensuring old versions of the software can still understand the messages, with the lowest possible overhead?

Typically, your protocol would specify that each message has:

  • a message length indicator that will work for all future versions. This is typically either a fixed-size integer that is guaranteed to be big enough, or a variable-length-encoded integer using extension bits, as you see with VLQ or UTF-8.
  • an indicator of the minimum version of the protocol that you need to understand to parse the message. This is important because new versions might introduce things that must be understood.
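
As an illustration of the variable-length option, here is a minimal sketch of a VLQ-style length encoding: 7 payload bits per byte, most significant group first, with the high bit set on every byte except the last. The function names `encodeVlq` and `decodeVlq` are hypothetical, and the sketch assumes non-negative safe integers.

```javascript
// Encode a non-negative integer as a VLQ byte sequence.
function encodeVlq(value) {
    const bytes = [value & 0x7f];            // last byte: continuation bit clear
    value = Math.floor(value / 128);
    while (value > 0) {
        bytes.unshift((value & 0x7f) | 0x80); // earlier bytes: continuation bit set
        value = Math.floor(value / 128);
    }
    return bytes;
}

// Decode a VLQ starting at `offset`; returns the value and the offset of the
// first byte after it.
function decodeVlq(bytes, offset = 0) {
    let value = 0;
    for (let i = offset; i < bytes.length; i++) {
        value = value * 128 + (bytes[i] & 0x7f);
        if ((bytes[i] & 0x80) === 0) {
            return { value, next: i + 1 };
        }
    }
    throw new Error("truncated VLQ");
}

const encoded = encodeVlq(300);     // two bytes: 0x82, 0x2c
const decoded = decodeVlq(encoded); // value 300, next offset 2
```

Lengths below 128 cost a single byte, which is what makes this kind of encoding attractive when most messages are small.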

Each new version of the protocol may then append new data after a prefix that conforms to the previous version of the protocol. Every version of the protocol has to specify how to recognize the end of the data it defines (in your example the data is fixed length, so that's easy), which is also where the data defined by some future version starts.

To process a message, the consumer checks to make sure it is a high enough version, processes the prefix that it understands, and uses the length field to skip the rest.
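
A minimal sketch of such a consumer follows. The framing used here is an assumption chosen only to keep the example short (a 1-byte minimum version, a 1-byte length, then the payload), and `MY_VERSION`, `KNOWN_PAYLOAD_SIZE` and `readMessages` are hypothetical names.

```javascript
const MY_VERSION = 1;         // the protocol version this consumer implements
const KNOWN_PAYLOAD_SIZE = 2; // payload bytes defined by that version

// Assumed framing, for illustration only:
// [minVersion: 1 byte][length: 1 byte][payload: `length` bytes]
function readMessages(buffer) {
    const understood = [];
    let pos = 0;
    while (pos + 2 <= buffer.length) {
        const minVersion = buffer[pos];
        const length = buffer[pos + 1];
        const start = pos + 2;
        const end = start + length;
        if (end > buffer.length) throw new Error("truncated message");

        if (minVersion <= MY_VERSION && length >= KNOWN_PAYLOAD_SIZE) {
            // Process only the prefix this version understands.
            understood.push(Array.from(buffer.slice(start, start + KNOWN_PAYLOAD_SIZE)));
        }
        // Understood or not, the length field says where the next message starts.
        pos = end;
    }
    return understood;
}

const stream = Uint8Array.of(
    1, 2, 0xc5, 0xea,       // current-version message
    1, 3, 0x11, 0x22, 0x33, // newer message with one extra trailing byte
    2, 1, 0xff              // requires version 2: skipped entirely
);
const result = readMessages(stream); // two messages: [197, 234] and [17, 34]
```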

For something as space-constrained as your protocol, I might do something like this:

  • The first byte is a 4-bit minimum version and a 4-bit length field.

  • If the length field L is in 0-11, then the remainder of the message is L+1 bytes long.

  • Otherwise (L is in 12-15), the L-11 bytes after the first byte are an integer containing the length of the remainder of the message.
  • If the minimum version you must understand ever needs to exceed 15, then some version of the protocol no later than version 15 will have to define additional version information in the message.
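
The header rules above could be decoded along these lines. Several details are assumptions that the description leaves open: the minimum version is placed in the high nibble, the length field in the low nibble, and a multi-byte length is read as big-endian. `parseHeader` is a hypothetical name.

```javascript
// Decode the 1-byte header (plus optional explicit length bytes).
// Assumptions: high nibble = minimum version, low nibble = length field L,
// explicit length bytes are big-endian.
function parseHeader(bytes, offset = 0) {
    if (offset >= bytes.length) throw new Error("missing header");
    const minVersion = bytes[offset] >> 4;
    const lengthField = bytes[offset] & 0x0f;

    if (lengthField <= 11) {
        // Short form: the remainder is L+1 bytes long.
        return { minVersion, bodyStart: offset + 1, bodyLength: lengthField + 1 };
    }

    // Long form: the next L-11 bytes (1 to 4) hold the length.
    const extraBytes = lengthField - 11;
    if (offset + extraBytes >= bytes.length) throw new Error("truncated header");
    let bodyLength = 0;
    for (let i = 1; i <= extraBytes; i++) {
        bodyLength = bodyLength * 256 + bytes[offset + i];
    }
    return { minVersion, bodyStart: offset + 1 + extraBytes, bodyLength };
}

const shortForm = parseHeader(Uint8Array.of(0x11, 0xc5, 0xea)); // version 1, 2-byte body
const longForm = parseHeader(Uint8Array.of(0x2d, 0x01, 0x00));  // version 2, 256-byte body
```

Under these assumptions the 2-byte message from the question still carries only 1 byte of overhead, because its length fits in the short form.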

You get forward compatibility (FC) by enforcing strict backward compatibility (BC) with this rule:

A new version must keep the field layout known to previous versions.

If you can follow the rule, you'll automatically have both BC and FC. As a consequence of the rule, you can only add new fields by appending them to the existing layout.

Let me explain with an example. Say that you need to add these fields for version 2:

Field-5 (1-bit)
Field-6 (7-bits)

Remember the rule: new fields can only be appended to the existing layout. So this is the version 2 message layout:

Version-Field,Field-1,Field-3,Field-2,Field-4,Field-5,Field-6

Because the layout known to version 1 is intact, your version 1 code can read messages of any version with this (pseudocode):

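function readMessageVersion1(input) {
    // input: array of bytes (e.g. a Uint8Array); input[0] is the version field
    const msg = {};
    msg.version = input[0];

    // byte 1: Field-1 in the low 4 bits, Field-3 in the high 4 bits
    msg.field1 = input[1] & 0x0f;
    msg.field3 = (input[1] >> 4) & 0x0f;

    // byte 2: Field-2 in the low 6 bits, Field-4 in the high 2 bits
    msg.field2 = input[2] & 0x3f;
    msg.field4 = (input[2] >> 6) & 0x03;

    return msg;
}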

Version 1 doesn't need to check the version field because the known layout is unconditional. However, version 2 and all other versions will need to check the version field. Assuming that we use the value 2 to indicate version 2, this will do (pseudocode):

function readMessageVersion2(input) {
    const msg = readMessageVersion1(input);

    // check version field: older messages carry no version 2 fields
    if (msg.version < 2) return msg;

    // byte 3: Field-5 in the lowest bit, Field-6 in the high 7 bits
    msg.field5 = input[3] & 0x01;
    msg.field6 = (input[3] >> 1) & 0x7f;

    return msg;
}

The most important parts of the code are that it reuses the reader from the previous version and that it performs this check:

if (msg.version < 2) return msg;

Version 3 of the code can simply follow version 2 like this:

function readMessageVersion3(input) {
    const msg = readMessageVersion2(input);

    // check version field
    if (msg.version < 3) return msg;

    // read the version 3 fields from the remaining input bytes here

    return msg;
}

Think of it as a template for future versions. By following the rule and the examples, any version of the protocol can read messages from any version with just 1 byte of overhead.
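
To see both directions at work, here is a small self-contained check written as plain JavaScript. The readers repeat the logic from the examples above; the sample byte values are made up for illustration.

```javascript
function readMessageVersion1(input) {
    return {
        version: input[0],
        field1: input[1] & 0x0f,
        field3: (input[1] >> 4) & 0x0f,
        field2: input[2] & 0x3f,
        field4: (input[2] >> 6) & 0x03,
    };
}

function readMessageVersion2(input) {
    const msg = readMessageVersion1(input);
    if (msg.version < 2) return msg; // older message: nothing more to read
    msg.field5 = input[3] & 0x01;
    msg.field6 = (input[3] >> 1) & 0x7f;
    return msg;
}

// Made-up sample messages.
const v1Bytes = Uint8Array.of(1, 0xc5, 0xea);
const v2Bytes = Uint8Array.of(2, 0xc5, 0xea, 0x55);

// Forward compatibility: the old reader ignores the appended byte.
const oldReadsNew = readMessageVersion1(v2Bytes);
// Backward compatibility: the new reader stops after the version 1 fields.
const newReadsOld = readMessageVersion2(v1Bytes);
// The new reader sees everything in a new message.
const newReadsNew = readMessageVersion2(v2Bytes);

console.log(oldReadsNew.field2, newReadsNew.field5, newReadsNew.field6); // 42 1 42
```

Note that the old reader only works here because it is handed exactly one message at a time. If messages are concatenated in a stream, it also needs some way to learn how long a newer message is.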
