Bitwise shifting for Base64 encode function in C++

I'm trying to rewrite this JavaScript Base64 encode routine in C++ (please note that it is non-standard Base64: its key string has a . at the beginning).

Here is an example of the JS script - https://jsfiddle.net/km53844e/1/

I have posted the JavaScript Base64 class below.

The JS script correctly converts CcnK to CMlaKA. However, the C++ version incorrectly converts it to CMlaKr. I'm not sure why; could it have something to do with the additional isNaN checks in the JS script?

Or could it be related to the null-terminated string in the C++ version? I notice that if I remove the terminator I get CMlaKs, which is still not correct.

I've tried adding the following to deal with the isNaN checks, but it's not working.

  if (isnan(char_array_4[1])) {
            char_array_4[2] = char_array_4[1] = 64;
        } else if (isnan(char_array_4[2])) {
            char_array_4[3] = 64;
        }

C++ Code:

std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) {
  std::string ret;
  int i = 0;
  int j = 0;
  unsigned char char_array_3[3];
  unsigned char char_array_4[4];

  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if (i == 3) {
      char_array_4[0] = char_array_3[0] & 0x3f;
      char_array_4[1] = ((char_array_3[0] & 0x0f) << 2) + ((char_array_3[1] & 0xc0) >> 6);
      char_array_4[2] = ((char_array_3[1] & 0x03) << 4) + ((char_array_3[2] & 0xf0) >> 4);
      char_array_4[3] = (char_array_3[2] & 0xfc) >> 2;

        if (isnan(char_array_4[1])) {
            char_array_4[2] = char_array_4[1] = 64;
        } else if (isnan(char_array_4[2])) {
            char_array_4[3] = 64;
        }

      for(i = 0; (i < 4) ; i++)
        ret += base64_chars[char_array_4[i]];
      i = 0;
    }
  }

  if (i)
  {
    for(j = i; j < 3; j++)
    char_array_3[j] = '\0';

    char_array_4[0] = char_array_3[0] & 0x3f;
    char_array_4[1] = ((char_array_3[0] & 0x0f) << 2) + ((char_array_3[1] & 0xc0) >> 6);
    char_array_4[2] = ((char_array_3[1] & 0x03) << 4) + ((char_array_3[2] & 0xf0) >> 4);
    char_array_4[3] = (char_array_3[2] & 0xfc) >> 2;

        if (isnan(char_array_4[1])) {
            char_array_4[2] = char_array_4[1] = 64;
        } else if (isnan(char_array_4[2])) {
            char_array_4[3] = 64;
        }

    for (j = 0; (j < i + 1); j++)
      ret += base64_chars[char_array_4[j]];

    while((i++ < 3))
      ret += '=';

  }

  return ret;

}

JS Code:

var Base64 = {
        _keyStr: ".ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+=",

    encode : function (input) {
        var output = [],
            chr1, chr2, chr3, enc1, enc2, enc3, enc4,
            i = 0;
        while (i < input.length) {
            chr1 = input[i++];
            chr2 = input[i++];
            chr3 = input[i++];

            enc1 = chr1 & 0x3f;
            enc2 = (chr1 >> 6) | ((chr2 & 0xf) << 2);
            enc3 = (chr2 >> 4) | ((chr3 & 0x3) << 4);
            enc4 = chr3 >> 2;

            if (isNaN(chr2)) {
                enc3 = enc4 = 64;
            } else if (isNaN(chr3)) {
                enc4 = 64;
            }

            output.push([this._keyStr.charAt(enc1),
                         this._keyStr.charAt(enc2),
                         this._keyStr.charAt(enc3),
                         this._keyStr.charAt(enc4)].join(''));
        }

        return output.join('');
    },

    decodeAsArray: function (b) {
        var d = this.decode(b),
            a = [],
            c;
                //alert("decoded base64:" + d);
        for (c = 0; c < d.length; c++) {
            a[c] = d.charCodeAt(c)
        }
                //alert("returning a");
        return a
    },

    decode: function( input ) {
        var output = "";
        var chr1, chr2, chr3 = "";
        var enc1, enc2, enc3, enc4 = "";
        var i = 0;

        do {
            enc1 = this._keyStr.indexOf(input.charAt(i++)) ;
            enc2 = this._keyStr.indexOf(input.charAt(i++)) ;
            enc3 = this._keyStr.indexOf(input.charAt(i++)) ;
            enc4 = this._keyStr.indexOf(input.charAt(i++)) ;

            chr1 = (enc1 | ((enc2 & 3) << 6));
            chr2 = (enc2 >> 2) | ((enc3 & 0x0F) << 4);
            chr3 = (enc3 >> 4) | (enc4 << 2);

            output = output + String.fromCharCode(chr1);
            if (enc3 != 64) {
                output = output + String.fromCharCode(chr2);
                        }
            if (enc4 != 64) {
                output = output + String.fromCharCode(chr3);
            }
            chr1 = chr2 = chr3 = "";
            enc1 = enc2 = enc3 = enc4 = "";
        } while (i < input.length);

        return (output);
    }

};

So, comparing your C++ code with the JavaScript, the isnan() checks should test the input bytes, not the output items, and map as follows:

if (isnan(char_array_3[1])) { // char_array_3[1] = chr2
    char_array_4[3] = char_array_4[2] = 64; // char_array_4[2] = enc3 & char_array_4[3] = enc4
} else if (isnan(char_array_3[2])) { // char_array_3[2] = chr3
    char_array_4[3] = 64; // char_array_4[3] = enc4
}

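But the main problem is that isnan() in C++ applies only to floating-point values and does not have the same meaning as isNaN() in JavaScript. In the JS code, chr2 and chr3 are undefined once the input runs out, and isNaN(undefined) is true; in C++, an unsigned char always holds a number, so the test can never succeed.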
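To make that point concrete, here is a minimal sketch that checks every possible byte value. The helper name any_byte_is_nan is hypothetical, and the explicit cast to double is only there to avoid relying on the integer overloads of std::isnan.

```cpp
#include <cmath>

// Hypothetical helper: returns true if any unsigned char value,
// converted to double, is reported as NaN by std::isnan.
bool any_byte_is_nan() {
  for (int v = 0; v <= 255; ++v) {
    unsigned char c = static_cast<unsigned char>(v);
    // Every byte converts to an ordinary finite double (0.0 .. 255.0).
    if (std::isnan(static_cast<double>(c))) {
      return true;
    }
  }
  return false;
}
```

Calling any_byte_is_nan() returns false, so the "missing byte" case has to be detected by counting how many bytes were actually read.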

Instead of using that isnan() function, replace the following part of base64_encode() :

  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if (i == 3) {
    ...
    if (isnan(char_array_4[1])) {
        char_array_4[2] = char_array_4[1] = 64;
    } else if (isnan(char_array_4[2])) {
        char_array_4[3] = 64;
    }
    ...
    for(i = 0; (i < 4) ; i++)
       ret += base64_chars[char_array_4[i]];
    i = 0;
  }

With the version shown after these three notes:

1- To avoid stale values when fewer than 3 bytes remain in the input buffer, set all three input bytes to 0x00 before the loop and again after each encoded group ( for(j=0;j<3;j++) char_array_3[j]=0x00; ).

2- When only 2 bytes have been loaded from the input buffer ( if (i == 2) ), the last item of the output buffer is set to 64.

3- When only 1 byte has been loaded from the input buffer ( if (i == 1) ), the last 2 items of the output buffer are set to 64.

  for(j=0;j<3;j++) char_array_3[j]=0x00; // initialize input array
  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if ((i == 3) || (in_len == 0)) { // encode when 3 bytes or end of buffer
    ...
    if (i == 1) { // instead of (isnan(char_array_4[1]))
        // both char_array_3[1] and char_array_3[2] are not defined
        char_array_4[3] = char_array_4[2] = 64;
    } else if (i == 2) { // instead of (isnan(char_array_4[2]))
        // char_array_3[2] is not defined
        char_array_4[3] = 64;
    }
    ...
    for(i = 0; (i < 4) ; i++)
       ret += base64_chars[char_array_4[i]];
    i = 0;
    for(j=0;j<3;j++) char_array_3[j]=0x00; // initialize input array
  }

The last error in the C++ base64_encode() function, compared to the JavaScript, is in the computation of the two middle output items. Instead of the following assignments:

  char_array_4[0] = char_array_3[0] & 0x3f;
  char_array_4[1] = ((char_array_3[0] & 0x0f) << 2) + ((char_array_3[1] & 0xc0) >> 6); // NOK
  char_array_4[2] = ((char_array_3[1] & 0x03) << 4) + ((char_array_3[2] & 0xf0) >> 4); // NOK
  char_array_4[3] = (char_array_3[2] & 0xfc) >> 2;

Use the assignments given after these two notes, which describe what was wrong:

1- In the calculation of char_array_4[1] (= enc2 in JS), the roles of char_array_3[0] (= chr1 in JS) and char_array_3[1] (= chr2 in JS) were swapped.

2- In the calculation of char_array_4[2] (= enc3 in JS), the roles of char_array_3[1] (= chr2 in JS) and char_array_3[2] (= chr3 in JS) were swapped.

  // JS =>  enc1 = chr1 & 0x3f;
  char_array_4[0] = (char_array_3[0] & 0x3f); // OK
  // JS =>  enc2 = (chr1 >> 6) | ((chr2 & 0xf) << 2);
  char_array_4[1] = ((char_array_3[0] & 0xc0) >> 6) + ((char_array_3[1] & 0x0f) << 2); // OK
  // JS => enc3 = (chr2 >> 4) | ((chr3 & 0x3) << 4);
  char_array_4[2] = ((char_array_3[1] & 0xf0) >> 4) + ((char_array_3[2] & 0x03) << 4); // OK
  // JS => enc4 = chr3 >> 2;
  char_array_4[3] = (char_array_3[2] & 0xfc) >> 2;
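As a quick check of the enc2 formula, the sketch below evaluates the original and the corrected expression for the last input byte of CcnK, which is 'K' (75) followed by a zeroed second byte. The names key_str, old_enc2 and new_enc2 are hypothetical, and key_str is assumed to be the same as the JavaScript _keyStr.

```cpp
#include <string>

// Assumption: same alphabet as the JavaScript _keyStr.
static const std::string key_str =
    ".ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+=";

// Formula from the original C++ code (chr1 and chr2 roles swapped).
unsigned old_enc2(unsigned char chr1, unsigned char chr2) {
  return ((chr1 & 0x0f) << 2) + ((chr2 & 0xc0) >> 6);
}

// Formula matching the JavaScript: (chr1 >> 6) | ((chr2 & 0xf) << 2).
unsigned new_enc2(unsigned char chr1, unsigned char chr2) {
  return ((chr1 & 0xc0) >> 6) + ((chr2 & 0x0f) << 2);
}
```

With these definitions, old_enc2(75, 0) is 44, which maps to 'r' in key_str, while new_enc2(75, 0) is 1, which maps to 'A'. That matches the CMlaKr versus CMlaKA difference reported in the question.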

After these corrections, the trailing if (i) block is no longer needed and should be removed: the loop itself now encodes the final partial group when in_len reaches 0.
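Putting the pieces together, the whole function could look like the sketch below. It is an assembled version of the fragments above, not tested against the original jsfiddle; base64_chars is not shown in the question, so it is assumed here to be identical to the JavaScript _keyStr.

```cpp
#include <string>

// Assumption: same alphabet as the JavaScript _keyStr
// ('.' is index 0, '=' is index 64).
static const std::string base64_chars =
    ".ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+=";

std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) {
  std::string ret;
  int i = 0;
  unsigned char char_array_3[3] = {0, 0, 0};  // input bytes, zeroed so missing bytes read as 0
  unsigned char char_array_4[4];

  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if ((i == 3) || (in_len == 0)) {  // encode when 3 bytes are loaded or at end of buffer
      char_array_4[0] = char_array_3[0] & 0x3f;
      char_array_4[1] = ((char_array_3[0] & 0xc0) >> 6) + ((char_array_3[1] & 0x0f) << 2);
      char_array_4[2] = ((char_array_3[1] & 0xf0) >> 4) + ((char_array_3[2] & 0x03) << 4);
      char_array_4[3] = (char_array_3[2] & 0xfc) >> 2;

      if (i == 1) {         // chr2 and chr3 missing (isNaN(chr2) in JS)
        char_array_4[3] = char_array_4[2] = 64;
      } else if (i == 2) {  // chr3 missing (isNaN(chr3) in JS)
        char_array_4[3] = 64;
      }

      for (int k = 0; k < 4; k++)
        ret += base64_chars[char_array_4[k]];

      i = 0;
      for (int j = 0; j < 3; j++) char_array_3[j] = 0x00;  // reset input array
    }
  }

  return ret;
}
```

Called with the four bytes of CcnK, this returns CMlaKA==, that is, the CMlaKA from the question followed by two padding characters taken from index 64 of the key string.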
