
How can I asynchronously decompress a gzip file in a JavaScript web worker?

I have a vanilla JS script running in the browser (only Chrome needs to be supported, if browser support matters).

I want to hand a 15 MB gzip file to a web worker, decompress it there asynchronously, and return the uncompressed data to the main thread, so that the main application thread does not freeze during decompression.

When decompressing in the main thread, I'm using the JSXCompressor library, and that works fine. However, this library references the window object, which isn't accessible from a worker context, so I can't use the same library injected into the worker code: running the decompression raises an exception on the first line of the library that mentions "window", saying it's undefined.

The same is true for the other JS libraries I've managed to dig up in an afternoon of googling, like zlib or the more modern Pako. They all seem to reference a DOM element in one way or another, which raises exceptions when used in a web worker context.

So my question is: is anyone aware of a way I can pull this off, either by explaining what I seem to be getting wrong, through a hack, or by pointing me to a JS library that can function in this use case? (I need only decompression, standard gzip.)

Edit: I'm also interested in any hack that can leverage built-in browser capabilities for ungzipping, as is done for HTTP requests.

Thanks a bunch.

I've authored a library, fflate, to accomplish exactly this task. It offers asynchronous versions of every compression/decompression method it supports, but rather than running in the event loop, the library delegates the processing to a separate thread. You don't need to manually create a worker or specify paths to the package's internal workers, since it generates them on the fly.

import { gunzip } from 'fflate';
// Let's suppose you got a File object (from, say, an input)
const reader = new FileReader();
reader.onloadend = () => {
  // The file contents are still gzip-compressed at this point
  const compressedBytes = new Uint8Array(reader.result);
  gunzip(compressedBytes, (err, decompressed) => {
    if (err) {
      console.error('Decompression failed:', err);
      return;
    }
    // This is a Uint8Array
    console.log('Decompressed output:', decompressed);
  });
};
reader.readAsArrayBuffer(fileObject);

Effectively, you need to convert the input to a Uint8Array, then convert the output to whatever format you want to use. For instance, FileReader is the most cross-platform solution for files, while fflate.strToU8 and fflate.strFromU8 work for string conversions.
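Since only Chrome needs to be supported here, the same input and output conversions can also be done with built-in APIs instead of FileReader and the fflate string helpers. The following is a minimal sketch under that assumption; `blobToU8`, `u8ToString`, and `demoConversions` are hypothetical helper names, not part of fflate.

```javascript
// Sketch: built-in conversions around a Uint8Array-based decompressor.
// All function names here are illustrative, not library APIs.

// Blob/File -> Uint8Array (promise-based alternative to FileReader).
async function blobToU8(blob) {
  return new Uint8Array(await blob.arrayBuffer());
}

// Uint8Array -> string, assuming the bytes are UTF-8 encoded text.
function u8ToString(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Round-trip a small text Blob through both helpers.
async function demoConversions() {
  const blob = new Blob(["Hello world"]);
  const bytes = await blobToU8(blob);
  return { length: bytes.length, text: u8ToString(bytes) };
}
```

In practice the result of `blobToU8(fileObject)` would be passed to the decompressor, and `u8ToString` applied to its output if the payload is text.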

PS: In my tests, fflate is still about as fast as the native CompressionStream solution, and it works in more browsers. If you want streaming support, use fflate's AsyncGunzip stream class.

There is a new web API proposal, Compression Streams, which is already implemented in Chrome and which does exactly this: asynchronously compress and decompress data.

It should support both the deflate and gzip algorithms, and it should use native implementations, which should make it faster than any library.

So in Chrome you can simply do:

if ("CompressionStream" in window) {
  (async () => {
    // To be able to pass gzipped data in a Stack Snippet, we host it as a
    // data URI that we convert to a Blob.
    // The original file is the UTF-8 text "Hello world", which is much
    // bigger once compressed, but that's another story ;)
    const compressed_blob = await fetch(
      "data:application/octet-stream;base64,H4sIAAAAAAAAE/NIzcnJVyjPL8pJAQBSntaLCwAAAA=="
    ).then((r) => r.blob());
    const decompressor = new DecompressionStream("gzip");
    const decompression_stream = compressed_blob.stream().pipeThrough(decompressor);
    const decompressed_blob = await new Response(decompression_stream).blob();
    console.log("decompressed:", await decompressed_blob.text());
  })().catch(console.error);
} else {
  console.error("Your browser doesn't support the Compression API");
}
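Both format names mentioned above are passed the same way, as a string to the constructor. As a minimal sketch (the helper name `roundTrip` is hypothetical), a compress-then-decompress round trip for either format could look like this:

```javascript
// Sketch: compress a string and decompress it again with the given format.
// `format` is expected to be "gzip" or "deflate".
async function roundTrip(text, format) {
  const input = new Blob([text]).stream();
  const compressed = input.pipeThrough(new CompressionStream(format));
  const restored = compressed.pipeThrough(new DecompressionStream(format));
  // Response is used only as a convenient way to drain the stream to text.
  return new Response(restored).text();
}
```

Calling `roundTrip("Hello world", "gzip")` or `roundTrip("Hello world", "deflate")` should resolve to the original string in any environment that implements the Compression Streams API.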

Obviously, this is also available in Web Workers, but since the API is designed as entirely asynchronous and makes use of Streams, browsers should theoretically already be able to move all the hard work to another thread on their own anyway.
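If you still prefer to run it in a dedicated worker, the worker script can be very small. The following is a sketch only, assuming the main thread posts the gzipped Blob or File to the worker; `gunzipBlobToText` and the shape of the reply message are hypothetical choices, not part of any API.

```javascript
// Sketch of a worker script built on DecompressionStream.
// Decompress a gzipped Blob and return its contents as text.
async function gunzipBlobToText(blob) {
  const stream = blob.stream().pipeThrough(new DecompressionStream("gzip"));
  return new Response(stream).text();
}

// Attach the message handler only when actually running inside a worker.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  self.onmessage = async (evt) => {
    try {
      // evt.data is assumed to be the gzipped Blob sent by the main thread.
      self.postMessage({ ok: true, text: await gunzipBlobToText(evt.data) });
    } catch (err) {
      self.postMessage({ ok: false, error: String(err) });
    }
  };
}
```

On the main thread, `worker.postMessage(compressedBlob)` sends the input and `worker.onmessage` receives the `{ ok, text }` reply. For binary payloads, replacing `.text()` with `.arrayBuffer()` and transferring the buffer avoids a copy.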


Now, this is still a bit of a future solution, and for today you might still want to use a library instead.

We don't do library recommendations here, but I should note that I personally use pako in Web Workers on a daily basis with no problem, and I don't see why a compression library would ever need the DOM, so I suspect you are doing something wrong™.

(async () => {
  const worker_script = `
    importScripts("https://cdnjs.cloudflare.com/ajax/libs/pako/1.0.11/pako_inflate.min.js");
    self.onmessage = async (evt) => {
      const file = evt.data;
      const buf = await file.arrayBuffer();
      const decompressed = pako.inflate(buf);
      // zero copy
      self.postMessage(decompressed, [decompressed.buffer]);
    };
  `;
  const worker_blob = new Blob([worker_script], { type: "application/javascript" });
  const worker_url = URL.createObjectURL(worker_blob);
  const worker = new Worker(worker_url);

  const compressed_blob = await fetch(
    "data:application/octet-stream;base64,H4sIAAAAAAAAE/NIzcnJVyjPL8pJAQBSntaLCwAAAA=="
  ).then((r) => r.blob());

  worker.onmessage = ({ data }) => {
    console.log("received from worker:", new TextDecoder().decode(data));
  };
  worker.postMessage(compressed_blob);
})().catch(console.error);
