
Read and base64 encode a binary file

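I'm trying to read a binary file from the filesystem and then base64 encode it in JavaScript. I'm using the FileReader API to read the data and a separate base64 encoder library (the `Base64` object used below).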

My code seems close to working; the problem is that the base64 data it generates is wrong. Here is what I have so far:

function saveResource() {
    var file = $(".resourceFile")[0].files[0];

    var reader = new FileReader();
    reader.onload = function(evt) {
        var fileData = evt.target.result;
        var bytes = new Uint8Array(fileData);
        var binaryText = '';

        for (var index = 0; index < bytes.byteLength; index++) {
            binaryText += String.fromCharCode( bytes[index] );
        }

        console.log(Base64.encode(binaryText));

    };
    reader.readAsArrayBuffer(file);
};

This is the file I'm testing with (a 100x100 blue square):

[image: 100x100 blue square]

According to an online base64 decoder/encoder, the file should encode as:

/ 9J / 4AAQSkZJRgABAgAAAQABAAD / 2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL / 2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL / wAARCABkAGQDASIAAhEBAxEB / 8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL / 8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4 + Tl5ufo6erx8vP09fb3 + PN6 / 8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL / 8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3 + PN6 / 9oADAMBAAIRAxEAPwDxyiiiv3E8wKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoooo AKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooA //ž

...but what I get from the JavaScript instead is:

W7 / DmMO / w6AAEEpGSUYAAQIAAAEAAQAAw7 / DmwBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDLDv8ObAEMBCQkJDAsMGA0NGDIhHCEyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMsO / w4AAEQgAZABkAwEiAAIRAQMRAcO / w4QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoLw7 / DhADCtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDLCgcKRwqEII0LCscOBFVLDkcOwJDNicsKCCQoWFxgZGiUmJygpKjQ1Njc4OTpDREVGR0hJSlNUVVZXWFlaY2RlZmdoaWpzdHV2d3h5esKDwoTChcKGwofCiMKJworCksKTwpTClcKWwpfCmMKZwprCosKjwqTCpcKmwqfCqMKpwqrCssKzwrTCtcK2wrfCuMK5wrrDgsODw4TDhcOGw4fDiMOJw4rDksOTw5TDlcOWw5fDmMOZw5rDocOiw6PDpMOlw6bDp8Oow6nDqsOxw7LDs8O0w7XDtsO3w7jDucO6w7 / DhAAfAQADAQEBAQEBAQEBAAAAAAAAAQIDBAUGBwgJCgvDv8OEAMK1EQACAQIEBAMEBwUEBAABAncAAQIDEQQFITEGEkFRB2FxEyIywoEIFELCkcKhwrHDgQkjM1LDsBVicsORChYkNMOhJcOxFxgZGiYnKCkqNTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXrCgsKDwoTChcKGwofCiMKJworCksKTwpTClcKWwpfCmMKZwprCosKjwqTCpcKmwqfCqMKpwqrCssKzwrTCtcK2wrfCuMK5wrrDgsODw4TDhcOGw4fDiMOJw4rDksOTw5TDlcOWw5fD mMOZw5rDosOjw6TDpcOmw6fDqMOpw6rDssOzw7TDtcO2w7fDuMO5w7rDv8OaAAwDAQACEQMRAD8Aw7HDiijCosK / cTzDgMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooA8O / w5k =

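If I had to hazard a guess, I'd say the problem has something to do with non-printing characters in the binary data (if I encode a plaintext document, it works fine). But what's the best way to work around the problem?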

Edit

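It looks like this may be a problem with the base64 library itself (or, if not that, with how the Uint8Array gets unpacked into a string for the library call). If I use the browser's btoa() function instead and pass it the binaryText string built from the Uint8Array, it works. Too bad that function doesn't exist in all browsers.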
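The two outputs above appear consistent with the string being UTF-8 encoded before it is base64 encoded, so that every byte above 0x7F turns into two bytes. The following is a minimal sketch that reproduces both results for the first four bytes of a JPEG (FF D8 FF E0). `toBinaryString` and `utf8Expand` are hypothetical helper names, the UTF-8 step is an assumption about what `Base64.encode` does rather than something taken from its source, and `btoa` and `TextEncoder` are assumed to exist in the environment.

```javascript
// Build a "binary string": one character per byte, char code == byte value.
function toBinaryString(bytes) {
    var text = '';
    for (var i = 0; i < bytes.length; i++) {
        text += String.fromCharCode(bytes[i]);
    }
    return text;
}

// Simulate an encoder that treats its input as text and UTF-8 encodes it first.
// Assumption: this mimics the library's behavior; it is not the library's code.
function utf8Expand(binaryText) {
    return toBinaryString(new TextEncoder().encode(binaryText));
}

var jpegHeader = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE0]);
var binaryText = toBinaryString(jpegHeader);

var direct = btoa(binaryText);               // bytes encoded as-is
var viaUtf8 = btoa(utf8Expand(binaryText));  // bytes >= 0x80 become two bytes first

console.log(direct);   // "/9j/4A=="
console.log(viaUtf8);  // "w7/DmMO/w6A="
```

The first result matches the start of the expected encoding and the second matches the start of the wrong one, which would explain why plaintext (all bytes below 0x80) encodes correctly.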

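And Google to the rescue. I found the following code, which takes the input data as a plain array of "bytes" (numbers between 0 and 255, inclusive; it also works fine if a Uint8Array is passed to it directly), and added it to the library I was using: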

//note:  it is assumed that the Base64 object has already been defined
//License:  Apache 2.0
Base64.byteToCharMap_ = null;
Base64.charToByteMap_ = null;
Base64.byteToCharMapWebSafe_ = null;
Base64.charToByteMapWebSafe_ = null;
Base64.ENCODED_VALS_BASE =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
    'abcdefghijklmnopqrstuvwxyz' +
    '0123456789';

/**
 * Our default alphabet. Value 64 (=) is special; it means "nothing."
 * @type {string}
 */
Base64.ENCODED_VALS = Base64.ENCODED_VALS_BASE + '+/=';
Base64.ENCODED_VALS_WEBSAFE = Base64.ENCODED_VALS_BASE + '-_.';

/**
 * Base64-encode an array of bytes.
 *
 * @param {Array.<number>|Uint8Array} input An array of bytes (numbers with
 *     value in [0, 255]) to encode.
 * @param {boolean=} opt_webSafe Boolean indicating we should use the
 *     alternative alphabet.
 * @return {string} The base64 encoded string.
 */
Base64.encodeByteArray = function(input, opt_webSafe) {
  Base64.init_();

  var byteToCharMap = opt_webSafe ?
                      Base64.byteToCharMapWebSafe_ :
                      Base64.byteToCharMap_;

  var output = [];

  for (var i = 0; i < input.length; i += 3) {
    var byte1 = input[i];
    var haveByte2 = i + 1 < input.length;
    var byte2 = haveByte2 ? input[i + 1] : 0;
    var haveByte3 = i + 2 < input.length;
    var byte3 = haveByte3 ? input[i + 2] : 0;

    var outByte1 = byte1 >> 2;
    var outByte2 = ((byte1 & 0x03) << 4) | (byte2 >> 4);
    var outByte3 = ((byte2 & 0x0F) << 2) | (byte3 >> 6);
    var outByte4 = byte3 & 0x3F;

    if (!haveByte3) {
      outByte4 = 64;

      if (!haveByte2) {
        outByte3 = 64;
      }
    }

    output.push(byteToCharMap[outByte1],
                byteToCharMap[outByte2],
                byteToCharMap[outByte3],
                byteToCharMap[outByte4]);
  }

  return output.join('');
};

/**
 * Lazy static initialization function. Called before
 * accessing any of the static map variables.
 * @private
 */
Base64.init_ = function() {
  if (!Base64.byteToCharMap_) {
    Base64.byteToCharMap_ = {};
    Base64.charToByteMap_ = {};
    Base64.byteToCharMapWebSafe_ = {};
    Base64.charToByteMapWebSafe_ = {};

    // We want quick mappings back and forth, so we precompute two maps.
    for (var i = 0; i < Base64.ENCODED_VALS.length; i++) {
      Base64.byteToCharMap_[i] =
          Base64.ENCODED_VALS.charAt(i);
      Base64.charToByteMap_[Base64.byteToCharMap_[i]] = i;
      Base64.byteToCharMapWebSafe_[i] =
          Base64.ENCODED_VALS_WEBSAFE.charAt(i);
      Base64.charToByteMapWebSafe_[
          Base64.byteToCharMapWebSafe_[i]] = i;
    }
  }
};

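The full source of the library that contains the function above is publicly available (Apache 2.0, per the comment in the code), but in its unmodified form it appears to depend on a number of other libraries. The slightly hacked-up version above should work for anyone who just needs a quick fix for this problem.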
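One way to gain some confidence in a byte-array encoder like the one above is to compare it against `btoa()` where that function exists. The sketch below is only an illustration: `findEncoderMismatch` and `viaBtoa` are hypothetical names, and the choice of test inputs (lengths 0 to 8, which covers all three padding cases) is arbitrary.

```javascript
// Compare an encoder that takes an array of bytes against btoa().
// Returns the first byte array on which they disagree, or null if none is found.
function findEncoderMismatch(encode) {
    for (var length = 0; length <= 8; length++) {
        for (var start = 0; start < 256; start++) {
            var bytes = [];
            var binary = '';
            for (var i = 0; i < length; i++) {
                var value = (start + i * 37) & 0xFF; // spread values across 0..255
                bytes.push(value);
                binary += String.fromCharCode(value);
            }
            if (encode(bytes) !== btoa(binary)) {
                return bytes;
            }
        }
    }
    return null;
}

// Reference encoder used only to demonstrate the check itself.
function viaBtoa(bytes) {
    return btoa(String.fromCharCode.apply(null, bytes));
}

console.log(findEncoderMismatch(viaBtoa)); // null
```

With the library code above loaded, the call would be `findEncoderMismatch(Base64.encodeByteArray)`.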

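Treat the binary data as an ArrayBuffer, which is independent of any character encoding. Your blue square (.jpg) has 361 native bytes, that is, octets with values from 0 to 255 (decimal), and they are not characters!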

That means: take the ArrayBuffer and encode it to Base64 with the well-known base64 algorithm.
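As a rough sketch of that idea, the helper below goes from an ArrayBuffer to base64 without ever treating the bytes as text. `arrayBufferToBase64` is a hypothetical name, and the sketch assumes `btoa` is available; where it is not, a byte-array encoder such as `Base64.encodeByteArray` from the question could take its place.

```javascript
// Base64-encode an ArrayBuffer by mapping each byte to one character code.
function arrayBufferToBase64(buffer) {
    var bytes = new Uint8Array(buffer);
    var chunkSize = 0x8000; // keep the argument list passed to apply() small
    var binary = '';
    for (var offset = 0; offset < bytes.length; offset += chunkSize) {
        var chunk = bytes.subarray(offset, offset + chunkSize);
        binary += String.fromCharCode.apply(null, chunk);
    }
    return btoa(binary);
}

var sample = new Uint8Array([0x00, 0x7F, 0x80, 0xFF]).buffer;
console.log(arrayBufferToBase64(sample)); // "AH+A/w=="
```

In the question's `reader.onload` handler this would be called as `arrayBufferToBase64(evt.target.result)`.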

With Perl you can get back to the original, which shows the blue square as above:

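use IO::File;      # also exports the O_* open flags from Fcntl
use MIME::Base64;  # provides decode_base64

my $fh = IO::File->new;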
$fh->open("d:/tmp/x.jpg", O_BINARY|O_CREAT|O_RDWR|O_TRUNC) or die $!;

$fh->print(decode_base64("/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBD
AQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCABkAGQDASIAAhEBAxEB/8QAFQABAQAA
AAAAAAAAAAAAAAAAAAf/xAAUEAEAAAAAAAAAAAAAAAAAAAAA/8QAFgEBAQEAAAAAAAAAAAAAAAAAAAUH/8QAFBEBAAAAAAAAAAAAAAAAAAAAAP/aAAwDAQACEQMR
AD8AjgDcUwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB//2Q==
"));


$fh->close;
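The same check can be sketched in JavaScript by decoding the beginning of the base64 text from the Perl snippet and looking for the JPEG markers. `base64ToBytes` is a hypothetical helper name and `atob` is assumed to be available.

```javascript
// Decode base64 text into raw bytes.
function base64ToBytes(base64Text) {
    var binary = atob(base64Text);
    var bytes = new Uint8Array(binary.length);
    for (var i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return bytes;
}

// First 16 characters of the base64 text used in the Perl snippet above.
var headerBytes = base64ToBytes('/9j/4AAQSkZJRgAB');

// A JPEG file starts with the bytes FF D8; a JFIF file carries the tag "JFIF" at offset 6.
var isJpeg = headerBytes[0] === 0xFF && headerBytes[1] === 0xD8;
var tag = String.fromCharCode(headerBytes[6], headerBytes[7], headerBytes[8], headerBytes[9]);

console.log(isJpeg); // true
console.log(tag);    // "JFIF"
```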

