
How can I asynchronously decompress a gzip file in a JavaScript web worker?

I have a vanilla JS script running in-browser (only Chrome needs to be supported, in case browser support matters).

I want to offload a 15 MB gzip file to a web worker, unzip it asynchronously there, and return the uncompressed data to the main thread, so that the main application thread doesn't freeze during decompression.

When unzipping in the main thread, I'm using the JSXCompressor library, and that works fine. However, since this library references the window object, which isn't accessible from a worker context, I can't use the same library injected into the worker code (running the decompression raises an exception on the first line of the library that mentions window, saying it's undefined).

The same is true for other JS libraries I've managed to dig up in an afternoon of googling, like zlib or the more modern Pako. They all seem to reference the DOM in one way or another, which raises exceptions when they're used in a web worker context.

So my question is: is anyone aware of a way I can pull this off, whether by explaining what I seem to be getting wrong, through a hack, or by linking to a JS library that works in this use case (I only need decompression of standard gzip)?

Edit: I'm also interested in any hack that can leverage built-in browser capabilities for ungzipping, as is done for HTTP requests.

Thanks a bunch.

I've authored a library, fflate, to accomplish exactly this task. It offers asynchronous versions of every compression/decompression method it supports, and rather than running on the event loop, it delegates the processing to a separate thread. You don't need to create a worker manually or specify paths to the package's internal worker files, since it generates them on the fly.

import { gunzip } from 'fflate';

// Suppose you got a File object (from, say, a file input)
const reader = new FileReader();
reader.onloadend = () => {
  // The gzipped file contents as a Uint8Array
  const compressed = new Uint8Array(reader.result);
  gunzip(compressed, (err, decompressed) => {
    if (err) throw err;
    // decompressed is a Uint8Array of the original data
    console.log('Decompressed output:', decompressed);
  });
};
reader.readAsArrayBuffer(fileObject);

Effectively, you need to convert the input format to a Uint8Array, then convert the output format to whatever you want to use. For instance, FileReader is the most cross-platform solution for files, while fflate.strToU8 and fflate.strFromU8 handle string conversions.
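For example, to read the decompressed bytes from the snippet above as text (assuming the decompressed Uint8Array from that callback holds UTF-8 data):

import { strFromU8 } from 'fflate';

// Decode the decompressed bytes as a UTF-8 string
const text = strFromU8(decompressed);
console.log(text);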

P.S. In my tests this is actually still about as fast as the native CompressionStream solution, but it works in more browsers. If you want streaming support, use fflate's AsyncGunzip stream class, sketched below.
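A minimal sketch of that streaming API; the compressed chunk variables here are placeholders for whatever source your data arrives from:

import { AsyncGunzip } from 'fflate';

// The callback fires for each decompressed chunk, produced off the main thread
const stream = new AsyncGunzip((err, chunk, final) => {
  if (err) throw err;
  console.log('got', chunk.length, 'decompressed bytes; last chunk:', final);
});

// Push compressed chunks as they arrive; flag the last one with true
stream.push(compressedChunk);
stream.push(finalCompressedChunk, true);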

There is a new web API proposal, Compression Streams, which is already implemented in Chrome and does exactly this: asynchronously compress/decompress data.

It supports both the deflate and gzip algorithms, and since it uses native implementations, it should be faster than any library.

So in Chrome you can simply do:

 if( "CompressionStream" in window ) { (async () => { // To be able to pass gzipped data in stacksnippet we host it as a data URI // that we do convert to a Blob. // The original file is an utf-8 text "Hello world" // which is way bigger once compressed, but that's an other story;) const compressed_blob = await fetch("data:application/octet-stream;base64,H4sIAAAAAAAAE/NIzcnJVyjPL8pJAQBSntaLCwAAAA==").then((r) => r.blob()); const decompressor = new DecompressionStream("gzip"); const decompression_stream = compressed_blob.stream().pipeThrough(decompressor); const decompressed_blob = await new Response(decompression_stream).blob(); console.log("decompressed:", await decompressed_blob.text()); })().catch(console.error); } else { console.error("Your browser doesn't support the Compression API"); }

Obviously, this is also available in Web Workers, but since the API is designed to be entirely asynchronous and to make use of Streams, browsers should theoretically already be able to offload all the hard work to another thread on their own anyway.
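For illustration, here's a minimal sketch of using DecompressionStream inside a worker; the file name and message shape are assumptions for the example, not part of the API:

// worker.js (hypothetical): decompress a gzip Blob posted from the main thread
self.onmessage = async (evt) => {
  const compressed_blob = evt.data;
  const stream = compressed_blob.stream().pipeThrough(new DecompressionStream("gzip"));
  const buf = await new Response(stream).arrayBuffer();
  // Transfer the result back to the main thread without copying
  self.postMessage(buf, [buf]);
};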


Now, this is still a bit of a future-facing solution, and for today you might still want to use a library instead.

However, we don't do library recommendations here, but I should note that I personally use pako in Web Workers on a daily basis with no problems, and I don't see why a compression library would ever need the DOM, so I suspect you are doing something wrong™.

(async () => {
  const worker_script = `
    importScripts("https://cdnjs.cloudflare.com/ajax/libs/pako/1.0.11/pako_inflate.min.js");
    self.onmessage = async (evt) => {
      const file = evt.data;
      const buf = await file.arrayBuffer();
      const decompressed = pako.inflate(buf);
      // zero copy
      self.postMessage(decompressed, [decompressed.buffer]);
    };
  `;
  const worker_blob = new Blob([worker_script], { type: "application/javascript" });
  const worker_url = URL.createObjectURL(worker_blob);
  const worker = new Worker(worker_url);
  const compressed_blob = await fetch("data:application/octet-stream;base64,H4sIAAAAAAAAE/NIzcnJVyjPL8pJAQBSntaLCwAAAA==")
    .then((r) => r.blob());
  worker.onmessage = ({ data }) => {
    console.log("received from worker:", new TextDecoder().decode(data));
  };
  worker.postMessage(compressed_blob);
})().catch(console.error);
