如何在浏览器 JavaScript 中将 stream 文件与计算机进行往来？

Question

I am working on a web app (pure HTML/Javascript, no libraries) that does byte-level processing of a file (Huffman Encoding demo).我正在开发一个 web 应用程序（纯 HTML/Javascript，没有库），它对文件进行字节级处理（霍夫曼编码演示）。 It works beautifully (you do NOT want to know how long it took to get there), but my sense of completion is bothering me just a bit because I have to load the files to and from an ArrayBuffer instead of streaming from the HDD.它工作得很好（你不想知道到达那里需要多长时间），但我的完成感让我有点困扰，因为我必须将文件加载到 ArrayBuffer 中，而不是从 HDD 流式传输。 There's also a filesize limitation, although it would admittedly take quite a long time to compress a 4GB file (the maximum that my data structures support).还有一个文件大小限制，尽管压缩一个 4GB 的文件（我的数据结构支持的最大值）确实需要很长时间。

Still, in the interest of making this app work on low-resource devices, how might I stream a file from a file input box (I need multiple passes for the frequency counting, filesize detection, and actual write) and to a browser download of some sort (that's in one pass at least, thankfully)?尽管如此，为了让这个应用程序在低资源设备上运行，我如何从文件input框（我需要多次通过频率计数、文件大小检测和实际写入）和浏览器下载文件某种（至少是一次，谢天谢地）？

Here are the relevant functions that handle it right now (I apologize for the globals:P):以下是现在处理它的相关函数（我为全局变量道歉：P）：

//Load the file
  function startProcessingFile(){ //Loads the file and sets up a callback to start the main process when done.
    var ff=document.getElementById("file");//I am assuming that you don't need to see the HTML here. :D
    if (ff.files.length === 0) {
      displayError("No file selected");
    }
    else{
      displayStatus("Loading File...");
      var fr = new FileReader;
      fr.onload=function () {inp = new DataView(fr.result); boot();}
      fr.onerror=function () {displayError(fr.error)};
      fr.readAsArrayBuffer(ff.files[0]);
    }
  }

//A bit later on -- one of the functions that reads the data from the input file
function countTypes(c){ //counts the frequencies. c is # bytes processed.
  if (die){
    die=false;
    return;
  }
  var i=Math.ceil(inputSize/100.0);
  while (c<inputSize && i>0){
    var d=inp.getUint8(c);
    frequencies[d]=frequencies[d]+1;
    i--;
    c++;//Accidental, but funny.
  }
  var perc=100.0*c/inputSize;
  updateProgress(perc);
  if (c<inputSize){
    setTimeout(function () {countTypes(c);}, 0);
  }
  else{
    updateProgress(100);
    system_state++;
    taskHandle();
  }
}

//Here's where the file is read the last time and also where the bits come from that I want to save. If I could stream the data directly I could probably even get rid of the dry-run stage I currently need to count how many bytes to allocate for the output ArrayBuffer. I mean, Google Drive can download files without telling the browser the size, just whether it's done yet or not, so I'd assume that's a feature I could access here too. I'm just not sure how you actually gain access to a download from JS in the first place.
function encode(c,d){ //performs the Huffman encoding. 
//If d is true, does not actually write. c is # of bits processed so far.
  if (die){
    die=false;
    return;
  }
  var i=Math.ceil(inputSize/250.0);
  while (c<inputSize && i>0){
    var b=inp.getUint8(c);
    var seq;
    for (var j=0; j<table.length; j++){
      if (table[j].value===b){
        seq=table[j].code
      }
    }
    for (var j=0; j<seq.length; j++){
      writeBit(seq[j],d);
    }
    i--;
    c++;//Accidental, but funny.
  }
  var perc=100.0*c/inputSize;
  updateProgress(perc);
  if (c<inputSize){
    setTimeout(function () {encode(c,d);}, 0);
  }
  else{
    updateProgress(100);
    system_state++;
    taskHandle();
  }
}

//Finally, bit-level access for unaligned read/write so I can actually take advantage of the variable word size of the Huffman encoding (the read is used for decoding).
function readBit(){ //reads one bit (b) from the ArrayBuffer/DataView. The offset of 4 is for the filesize int.
  var data_byte=inp.getUint8(byte_index+4);
  var res=data_byte>>>bit_index;
  bit_index+=1;
  if (bit_index>7){
    bit_index=0;
    byte_index++;
  }
  return (res&1);
}

function writeBit(b,d){ //writes one bit (b) to the output Arraybuffer/Dataview. If d is true, does not actually write.
  if (d===false){ //i.e. not dry-run mode
    var bitmask=0xff;
    var flag=1<<bit_index;
    bitmask=bitmask^flag;
    current_byte=current_byte&bitmask;
    current_byte=current_byte|(b<<bit_index);
    output.setUint8(byte_index+4, current_byte);
  }
  bit_index+=1;
  if (bit_index>7){
    bit_index=0;
    byte_index++;
  }
}

function readByte(){ //reads a byte using readBit. Unaligned.
  var b=0;
  for (var i=0; i<8; i++){
    var t=readBit();
    b=b|(t<<i);
  }
  return b;
}

function writeByte(b,d){ //writes a byte using writeByte. Unaligned.
  for (var i=0; i<8; i++){
    var res=b>>>i;
    writeBit((res&1),d); 
  }
}

//And finally the download mechanism I'm using.
function downloadResult(){//download processed file with specified extension
  var blobObject = new Blob([output], {type: 'application/octet-stream'});
  var n=source_name.split('\\').pop().split('/').pop();
  if (doEncode){
    n=n+fext
  }else{
    n=n.replace(fext,"");
  }
  var a = document.createElement("a");
  a.setAttribute("href", URL.createObjectURL(blobObject));
  a.setAttribute("download", n);
  a.click();
  delete a;
  running=false;
  var b=document.getElementById("ac");
  if (b.classList.contains("activeNav")){
    clearRes();
  }
}

I basically want to rip most of that out and replace it with something that can read bytes or medium-ish chunks of data out of the file that the user selects, and then when it gets to the actual output stage, trickle that data byte-by-byte through a more-or-less vanilla download to their download folder.我基本上想撕掉大部分内容，并用可以从用户选择的文件中读取字节或中等数据块的东西替换它，然后当它到达实际的 output 阶段时，涓涓该数据字节-通过或多或少的香草下载到他们的下载文件夹中。

I do know that multiple files can be selected in a file input box, so perhaps if it's possible to download to a subfolder I could work out how to make an in-browser file archiver for the heck of it.我确实知道可以在文件输入框中选择多个文件，所以如果可以下载到子文件夹，我可能会想出如何制作一个浏览器内的文件存档器来解决这个问题。 Wouldn't that be fun.那岂不是很有趣。 ..,Mind, I'm fairly sure it's not possible (I don't see why you shouldn't be able to create a subdirectory in the browser downloads folder from the webpage. but there's probably a security reason). ..，请注意，我很确定这是不可能的（我不明白为什么您不能在网页的浏览器下载文件夹中创建子目录。但可能是出于安全原因）。

Let me know if you need to see more code, but as this is a class project I don't want to get accused of plagiarizing my own app...如果您需要查看更多代码，请告诉我，但由于这是一个 class 项目，我不想被指责抄袭我自己的应用程序...

Answer 1

To read from the disk as a stream从磁盘读取为 stream

you can use the Blob.stream() method which returns aReadableStream from that Blob (or File).您可以使用Blob.stream()方法从该 Blob（或文件）返回ReadableStream 。

 inp.onchange = async (evt) => { const stream = inp.files[ 0 ].stream(); const reader = stream.getReader(); while( true ) { const { done, value } = await reader.read(); if( done ) { break; } handleChunk( value ); } console.log( "all done" ); }; function handleChunk( buf ) { console.log( "received a new buffer", buf.byteLength ); }

 <input type="file" id="inp">

For older browsers that don't support this method, you can still read the File by chunks only using its .slice() method:对于不支持此方法的旧浏览器，您仍然可以仅使用其.slice()方法按块读取文件：

 inp.onchange = async (evt) => { const file = inp.files[ 0 ]; const chunksize = 64 * 1024; let offset = 0; while( offset < file.size ) { const chunkfile = await file.slice( offset, offset + chunksize ); // Blob.arrayBuffer() can be polyfilled with a FileReader const chunk = await chunkfile.arrayBuffer(); handleChunk( chunk ); offset += chunksize; } console.log( "all done" ); }; function handleChunk( buf ) { console.log( "received a new buffer", buf.byteLength ); }

 <input type="file" id="inp">

Writing to disk as stream however is a bit harder.但是，以 stream 的形式写入磁盘有点困难。

There is a great hack by Jimmy Wärting called StreamSaver.js which uses Service Workers. Jimmy Wärting 有一个很棒的 hack ，叫做StreamSaver.js ，它使用了 Service Worker。 I'm not sure how far its browser support goes by though, and while awesome, it's still an "hack" and requires a Service Worker to run.我不确定它的浏览器支持能走多远，虽然很棒，但它仍然是一个“黑客”，需要一个 Service Worker 才能运行。

An easier way to do so is to use the being defined File System API which is currently only available in Chrome.一种更简单的方法是使用正在定义的文件系统 API ，该文件系统目前仅在 Chrome 中可用。 You can see this Q/A for a code example.您可以查看此 Q/A以获取代码示例。

Answer 2

There is a streams API now already supported from the modern browsers in Javascript Javascript 中的现代浏览器现在已经支持流 API

Mozilla Streams MDN with samples带有示例的 Mozilla Streams MDN

// setup your stream with the options, it will help handle the size limitations etc.
var readableStream = new ReadableStream(underlyingSource[, queuingStrategy]);

fetch("https://www.example.org/").then((response) => {
  const reader = response.body.getReader();
  const stream = new ReadableStream({
    start(controller) {
      // The following function handles each data chunk
      function push() {
        // "done" is a Boolean and value a "Uint8Array"
        reader.read().then(({ done, value }) => {
          // Is there no more data to read?
          if (done) {
            // Tell the browser that we have finished sending data
            controller.close();
            return;
          }

          // Get the data and send it to the browser via the controller
          controller.enqueue(value);
          push();
        });
      };
      
      push();
    }
  });

  return new Response(stream, { headers: { "Content-Type": "text/html" } });
});

如何在浏览器 JavaScript 中将 stream 文件与计算机进行往来？

问题描述

2 个解决方案

解决方案1
1 已采纳 2020-12-01 08:54:19

To read from the disk as a stream从磁盘读取为 stream

Writing to disk as stream however is a bit harder.但是，以 stream 的形式写入磁盘有点困难。

解决方案2
0 2020-12-01 07:37:36

如何在浏览器 JavaScript 中将 stream 文件与计算机进行往来？

问题描述

2 个解决方案

解决方案1 1 已采纳 2020-12-01 08:54:19

To read from the disk as a stream从磁盘读取为 stream

Writing to disk as stream however is a bit harder.但是，以 stream 的形式写入磁盘有点困难。

解决方案2 0 2020-12-01 07:37:36

解决方案1
1 已采纳 2020-12-01 08:54:19

解决方案2
0 2020-12-01 07:37:36