Chrome FileReader returns an empty string for big files (>= 300MB)

Question

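Goal:

In the browser, read a file from the user's file system as a base64 string
These files are up to 1.5GB

Issue:

The following script works fine on Firefox, regardless of the file size.
On Chrome, the script works fine for smaller files (I've tested files of around 5MB in size)
If you pick a bigger file (e.g. 400MB), the FileReader completes without an error or exception, but returns an empty string instead of the base64 string

Questions:

Is this a Chrome bug?
Why is there neither an error nor an exception?
How can I fix or work around this issue?

Important:

Please note that chunking is not an option for me, since I need to send the full base64 string via 'POST' to an API that does not support chunks.

Code: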

 'use strict';

 var filePickerElement = document.getElementById('filepicker');

 filePickerElement.onchange = (event) => {
   const selectedFile = event.target.files[0];
   console.log('selectedFile', selectedFile);
   readFile(selectedFile);
 };

 function readFile(selectedFile) {
   console.log('START READING FILE');
   const reader = new FileReader();

   reader.onload = (e) => {
     const fileBase64 = reader.result.toString();
     console.log('ONLOAD', 'base64', fileBase64);
     if (fileBase64 === '') {
       alert('Result string is EMPTY :(');
     } else {
       alert('It worked as expected :)');
     }
   };

   reader.onprogress = (e) => {
     // ~~ truncates the percentage to an integer
     console.log('Progress', ~~((e.loaded / e.total) * 100), '%');
   };

   reader.onerror = (err) => {
     console.error('Error reading the file.', err);
   };

   reader.readAsDataURL(selectedFile);
 }

 <!doctype html>
 <html lang="en">
 <head>
   <!-- Required meta tags -->
   <meta charset="utf-8">
   <meta name="viewport" content="width=device-width, initial-scale=1">
   <!-- Bootstrap CSS -->
   <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-wEmeIV1mKuiNpC+IOBjI7aAzPcEZeedi5yW5f2yOq55WWLwNGmvvx4Um1vskeMj0" crossorigin="anonymous">
   <title>FileReader issue example</title>
 </head>
 <body>
   <div class="container">
     <h1>FileReader issue example</h1>
     <div class="card">
       <div class="card-header">
         Select File:
       </div>
       <div class="card-body">
         <input type="file" id="filepicker" />
       </div>
     </div>
   </div>
   <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0/dist/js/bootstrap.bundle.min.js" integrity="sha384-p34f1UUtsS3wqzfto5wAAmdvj+osOnFyQFpp4Ua3gs/ZVWx6oOypYoCJhGGScy+8" crossorigin="anonymous"></script>
   <script src="main.js"></script>
 </body>
 </html>

Answer 1

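Is this a Chrome bug?

As I said in my answer to "Chrome, FileReader API, event.target.result === """, this is a limitation of V8 (the JavaScript engine used by Chrome, but also by node-js and others).
It is intentional, so it can't really be called a "bug".
Technically, what fails here is building a String of more than 512MB (minus the header) on 64-bit systems, because in V8 all heap objects must fit in a Smi (Small Integer) (cf. this commit).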
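
To make the limit concrete, the arithmetic can be sketched as below. The constant is the string length reported by the snippet later in this answer (512 MiB minus a 24-byte header); it is an assumption for illustration, since the exact limit depends on the V8 version and platform. `base64Length` and `fitsInV8String` are hypothetical helper names, not part of any API.

```javascript
// Assumed maximum string length on 64-bit V8: 512 MiB minus a 24-byte header.
// The real limit varies between V8 versions, so treat this as illustrative.
const ASSUMED_MAX_STRING_LENGTH = 512 * 1024 * 1024 - 24;

// Base64 turns every group of 3 bytes (the last one padded) into 4 characters.
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// A data URL is a short prefix ("data:<type>;base64,") followed by the payload.
function fitsInV8String(byteLength, prefixLength = 0) {
  return prefixLength + base64Length(byteLength) <= ASSUMED_MAX_STRING_LENGTH;
}

const MiB = 1024 * 1024;
console.log(fitsInV8String(5 * MiB));   // true
console.log(fitsInV8String(400 * MiB)); // false
```

Under this assumption, the 5MB file from the question fits comfortably, while the base64 form of a 400MB file is already longer than the largest string V8 will build.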

Why is there neither an error nor an exception?

That might be a bug... As I also show in my linked answer, we do get a RangeError when creating such a string directly:

 const header = 24;
 const bytes = new Uint8Array( (512 * 1024 * 1024) - header );
 let txt = new TextDecoder().decode( bytes );
 console.log( txt.length ); // 536870888
 txt += "f"; // RangeError

In step 3 of FileReader::readOperation, UAs have to

If package data threw an exception error:

Set fr's error to error.

Fire a progress event called error at fr.

But here, we don't get that error.

 const bytes = Uint32Array.from( { length: 600 * 1024 * 1024 / 4 }, (_) => Math.random() * 0xFFFFFFFF );
 const blob = new Blob( [ bytes ] );
 const fr = new FileReader();
 fr.onerror = console.error;
 fr.onload = (evt) => console.log( "success", fr.result.length, fr.error );
 fr.readAsDataURL( blob );

I will open an issue about this, because you should be able to handle that error from the FileReader.
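
Until the FileReader reports this itself, a defensive check in the load handler can turn the silent failure into a catchable error. The following is only a sketch: `assertDataURLNotEmpty` and `readAsDataURLChecked` are hypothetical names, and the check assumes that an empty result for a non-empty file always means the string could not be built.

```javascript
// Hypothetical helper: treats an empty result for a non-empty file as a failure.
function assertDataURLNotEmpty(result, byteLength) {
  if (byteLength > 0 && (typeof result !== "string" || result === "")) {
    throw new RangeError(
      `FileReader produced no data for a ${byteLength}-byte file; ` +
      "the base64 string is probably too long for the engine"
    );
  }
  return result;
}

// Promise wrapper around FileReader (browser only) that applies the check.
function readAsDataURLChecked(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = () => reject(reader.error);
    reader.onload = () => {
      try {
        resolve(assertDataURLNotEmpty(reader.result, file.size));
      } catch (err) {
        reject(err);
      }
    };
    reader.readAsDataURL(file);
  });
}
```

With this in place, `readAsDataURLChecked(selectedFile).catch(console.error)` would log a RangeError for the 400MB case instead of silently yielding an empty string.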

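How can I fix or work around this issue?

The best option is definitely to make your API end-point accept binary resources directly instead of data:// URLs, which should be avoided anyway.

If this is not doable, a "for the future" solution would be to POST a ReadableStream to your end-point, and do the data:// URL conversion yourself, on a stream from the Blob.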

class base64StreamEncoder {
  constructor( header ) {
    if( header ) {
      this.header = new TextEncoder().encode( header );
    }
    this.tail = [];
  }
  transform( chunk, controller ) {
    const encoded = this.encode( chunk );
    if( this.header ) {
      controller.enqueue( this.header );
      this.header = null;
    }
    controller.enqueue( encoded );
  }
  encode( bytes, final = false ) {
    // prepend the bytes left over from the previous chunk
    const all = new Uint8Array( this.tail.length + bytes.length );
    all.set( this.tail );
    all.set( bytes, this.tail.length );
    // only encode a multiple of 3 bytes, so that no "=" padding
    // ends up in the middle of the stream
    const tail_length = final ? 0 : all.length % 3;
    const last_index = all.length - tail_length;
    this.tail = all.subarray( last_index );
    let binary = "";
    for( let i = 0; i<last_index; i++ ) {
        binary += String.fromCharCode( all[ i ] );
    }
    const b64String = window.btoa( binary );
    return new TextEncoder().encode( b64String );
  }
  flush( controller ) {
    // force the encoding of the tail
    controller.enqueue( this.encode( new Uint8Array(), true ) );
  }
}

Live example: https://base64streamencoder.glitch.me/

For now, you'd have to store the chunks of the base64 representation in a Blob, as demonstrated in Endless's answer.
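
One way to combine the transformer above with Blob storage might look like the following sketch. `transformBlob` is a hypothetical helper name; it accepts any transformer object exposing `transform()` and optionally `flush()`, and it assumes an environment that provides `Blob.prototype.stream()` and `TransformStream`.

```javascript
// Pipes a Blob through a transformer object and collects the output
// chunks into a new Blob, so the full result never has to exist as one string.
async function transformBlob(blob, transformer, type = "text/plain") {
  const encodedStream = blob.stream().pipeThrough(new TransformStream(transformer));
  const reader = encodedStream.getReader();
  const parts = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Wrapping each chunk in its own Blob lets the browser decide where to keep it.
    parts.push(new Blob([value]));
  }
  return new Blob(parts, { type });
}
```

In a browser, this could be called as `await transformBlob(selectedFile, new base64StreamEncoder("data:application/octet-stream;base64,"))`, and the resulting Blob used as the body of the POST request.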

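However, beware that since this is a V8 limitation, even the server side may run into problems with strings this big, so in any case you should contact your API's maintainer.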

Answer 2

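Here is a partial solution that transforms a blob, chunk by chunk, into base64 blobs, then concatenates everything into one JSON blob, with a prefix and a suffix for the JSON part and the base64 chunks in between.

Keeping it as a blob allows the browser to optimize the memory allocation and offload it to disk if needed.

You could try changing chunkSize to something larger; browsers like to keep smaller blob chunks in memory (one bucket).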

 // get some dummy gradient file (blob)
 var a=document.createElement("canvas"),b=a.getContext("2d"),c=b.createLinearGradient(0,0,3000,3000);a.width=a.height=3000;c.addColorStop(0,"red");c.addColorStop(1,"blue");b.fillStyle=c;b.fillRect(0,0,a.width,a.height);a.toBlob(main);

 async function main (blob) {
   var fr = new FileReader()
   // Best to add 2 so it strips == from all chunks
   // except from the last chunk
   var chunkSize = (1 << 16) + 2
   var pos = 0
   var b64chunks = []

   while (pos < blob.size) {
     await new Promise(rs => {
       fr.readAsDataURL(blob.slice(pos, pos + chunkSize))
       fr.onload = () => {
         const b64 = fr.result.split(',')[1]
         // Keeping it as a blob allows the browser to offload memory to disk
         b64chunks.push(new Blob([b64]))
         rs()
       }
       pos += chunkSize
     })
   }

   // How you concatenate all chunks to json is now up to you.
   // this solution/answer is more of a guideline of what you need to do
   // There are some ways to do it more automatically but here is the
   // simplest form
   // (fyi: this new blob won't create so much data in memory, it will only
   // keep reference points to the other blobs' locations)
   const jsonBlob = new Blob([
     '{"data": "', ...b64chunks, '"}'
   ], { type: 'application/json' })

   /*
   // strongly advise you to tell the api developers
   // to add support for binary/file upload (multipart-formdata)
   // base64 is roughly ~33% larger and streaming
   // this data on the server to the disk is almost impossible
   fetch('./upload-files-to-bad-json-only-api', {
     method: 'POST',
     body: jsonBlob
   })
   */

   // Just a test that it still works
   //
   // new Response(jsonBlob).json().then(console.log)
   fetch('data:image/png;base64,' + await new Blob(b64chunks).text()).then(r => r.blob()).then(b => console.log(URL.createObjectURL(b)))
 }

I have avoided doing base64 += fr.result.split(',')[1] and JSON.stringify, since GiB of data is a lot, and JSON shouldn't be handling binary data anyway.
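
The reason the chunk size is a multiple of 3 can be checked with a small, self-contained sketch. `toBase64` and `encodeInChunks` are hypothetical helpers written for this illustration; they build ordinary strings and are only meant for small inputs.

```javascript
// Encodes a Uint8Array to base64 via a binary string (fine for small inputs).
function toBase64(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Encodes `bytes` chunk by chunk and concatenates the base64 pieces.
function encodeInChunks(bytes, chunkSize) {
  let out = "";
  for (let pos = 0; pos < bytes.length; pos += chunkSize) {
    out += toBase64(bytes.subarray(pos, pos + chunkSize));
  }
  return out;
}

const data = Uint8Array.from({ length: 10 }, (_, i) => i);
const whole = toBase64(data);

console.log(encodeInChunks(data, 3) === whole); // true: 3 is a multiple of 3
console.log(encodeInChunks(data, 4) === whole); // false: "=" padding ends up mid-string
console.log(((1 << 16) + 2) % 3);               // 0
```

Because 3 input bytes map to exactly 4 base64 characters, chunks whose size is a multiple of 3 can be encoded independently and simply concatenated; only the last chunk may carry padding.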

Solution 1
1 Accepted 2021-05-16 03:18:14

Solution 2
0 2021-05-12 10:03:42
