How does axios handle blob vs arraybuffer as responseType?

Question

I'm downloading a zip file with axios. For further processing, I need the "raw" data that has been downloaded. As far as I can see, JavaScript has two types for this: Blobs and ArrayBuffers. Both can be specified as responseType in the request options.

As a next step, the zip file needs to be uncompressed. I've tried two libraries for this: js-zip and adm-zip. Both want the data to be an ArrayBuffer. So far so good, I can convert the blob to a buffer. After this conversion, adm-zip always happily extracts the zip file. However, js-zip complains about a corrupted file unless the zip has been downloaded with 'arraybuffer' as the axios responseType. js-zip does not work on a buffer that has been taken from a blob.

This was very confusing to me. I thought both ArrayBuffer and Blob were essentially just views on the underlying memory. There might be a difference in performance between downloading something as a blob vs. a buffer, but the resulting data should be the same, right?

Well, I decided to experiment and found this:

If you specify responseType: 'blob', axios converts response.data to a string. Let's say you hash this string and get hash A. Then you convert it to a buffer. For this conversion, you need to specify an encoding. Depending on the encoding, you will get a variety of new hashes; let's call them B1, B2, B3, ... When specifying 'utf8' as the encoding, I get back to the original hash A.

So I guess when downloading data as a 'blob', axios implicitly converts it to a string encoded with utf8. This seems very reasonable.

Now you specify responseType: 'arraybuffer'. Axios provides you with a buffer as response.data. Hash the buffer and you get a hash C. This hash does not match any of A, B1, B2, ...

So when downloading data as an 'arraybuffer', you get entirely different data?

It now makes sense to me that the unzipping library js-zip complains if the data is downloaded as a 'blob'. It probably actually is corrupted somehow. But then how is adm-zip able to extract it? I checked the extracted data, and it is correct. This might only be the case for this specific zip archive, but it nevertheless surprises me.

Here is the sample code I used for my experiments:

//typescript import syntax, this is executed in nodejs
import axios from 'axios';
import * as crypto from 'crypto';

axios.get(
    "http://localhost:5000/folder.zip", //hosted with serve
    { responseType: 'blob' }) // replace this with 'arraybuffer' and response.data will be a buffer
    .then((response) => {
        console.log(typeof (response.data));

        // first hash the response itself
        console.log(crypto.createHash('md5').update(response.data).digest('hex'));

        // then convert to a buffer and hash again
        // replace 'binary' with any valid encoding name
        let buffer = Buffer.from(response.data, 'binary');
        console.log(crypto.createHash('md5').update(buffer).digest('hex'));
        //...
    })
    .catch((error) => console.error(error));

What creates the difference here, and how do I get the 'true' downloaded data?

Answer 1

From the axios docs:

// `responseType` indicates the type of data that the server will respond with
// options are: 'arraybuffer', 'document', 'json', 'text', 'stream'
//   browser only: 'blob'
responseType: 'json', // default

`'blob'` is a "browser only" option.

So from node.js, when you set responseType: "blob", "json" will actually be used, which I guess falls back to "text" when no parsable JSON data has been fetched.
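In practice this suggests that, on Node, the raw bytes come from requesting 'arraybuffer' rather than 'blob'. The sketch below is not taken from the axios docs: `toRawBuffer` is a hypothetical helper that accepts whatever an HTTP client handed back and refuses strings, since a string has already been through a text decoder. The commented axios call is the assumed usage; the demo underneath runs on hand-made bytes so it needs no network.

```javascript
// Hypothetical helper (not part of axios): normalise response data into a
// Node Buffer without ever going through a text encoding.
function toRawBuffer(data) {
  if (Buffer.isBuffer(data)) {
    return data; // already raw bytes
  }
  if (data instanceof ArrayBuffer) {
    return Buffer.from(data); // wraps the same memory, no copy
  }
  if (ArrayBuffer.isView(data)) {
    // typed arrays / DataView: respect the view's offset and length
    return Buffer.from(data.buffer, data.byteOffset, data.byteLength);
  }
  if (typeof data === 'string') {
    throw new TypeError(
      'Got a string: the body was decoded as text and may already be corrupted'
    );
  }
  throw new TypeError('Unsupported response data type');
}

// Assumed usage with axios in Node (not executed here):
//   const response = await axios.get(url, { responseType: 'arraybuffer' });
//   const zipBytes = toRawBuffer(response.data);

// Offline demo on hand-made bytes, including values above 0x7F.
const sample = Uint8Array.of(0x50, 0x4b, 0x03, 0x04, 0x89, 0xff);
console.log(toRawBuffer(sample.buffer).toString('hex')); // 504b030489ff
```

Throwing on strings is a design choice: it surfaces the 'blob'-on-Node situation from the question immediately instead of passing damaged bytes on to the unzip library.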

Fetching binary data as text is prone to generating corrupted data. Because the text returned by Body.text() and many other APIs is a USVString (which doesn't allow unpaired surrogate code points) and because the response is decoded as UTF-8, some bytes from the binary file can't be mapped to characters correctly and will thus be replaced by the replacement character � (U+FFFD), with no way to get back what that data was before: your data is corrupted.

Here is a snippet explaining this, using the header of a .png file (0x89 0x50 0x4E 0x47) as an example.

(async () => {

  const url = 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png';
  // fetch as binary
  const buffer = await fetch( url ).then( resp => resp.arrayBuffer() );

  const header = new Uint8Array( buffer ).slice( 0, 4 );
  console.log( 'binary header', header ); // [ 137, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( header ) );
  // [ "U+0089", "U+0050", "U+004E", "U+0047" ]
  // You can read more about the (U+0089) character here
  // https://www.fileformat.info/info/unicode/char/0089/index.htm
  // You can see in the left table how this character needs two bytes in UTF-8 (0xC2 0x89)
  // A lone 0x89 byte is therefore not valid UTF-8:
  // the decoder discards it and emits the replacement character instead

  // read as UTF-8
  const utf8_str = await new Blob( [ header ] ).text();
  console.log( 'read as UTF-8', utf8_str ); // "�PNG"
  // build back a binary array from that string
  const utf8_binary = [ ...utf8_str ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', utf8_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( utf8_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // You can read more about the � (U+FFFD) character here
  // https://www.fileformat.info/info/unicode/char/0fffd/index.htm
  //
  // P (U+0050), N (U+004E) and G (U+0047) characters are compatible between UTF-8 and UTF-16
  // For these there is no encoding loss
  // (that's how base64 encoding makes it possible to send binary data as text)

  // now let's see what fetching as text holds
  const fetched_as_text = await fetch( url ).then( resp => resp.text() );
  const header_as_text = fetched_as_text.slice( 0, 4 );
  console.log( 'fetched as "text"', header_as_text ); // "�PNG"
  const as_text_binary = [ ...header_as_text ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', as_text_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( as_text_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // It's been read as UTF-8, we lost the first byte.

})();

function entityEncode( arr ) {
  return Array.from( arr ).map( val => 'U+' + toHex( val ) );
}
function toHex( num ) {
  return num.toString( 16 ).padStart( 4, '0' ).toUpperCase();
}
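The same loss can be reproduced in Node without any network access, which also accounts for the hashes A, B1, B2, ... and C described in the question. This is a sketch under assumptions: the four-byte PNG header stands in for the downloaded file, and `md5` is a small helper defined here, not something from axios.

```javascript
const crypto = require('crypto');

// Small helper defined for this sketch.
const md5 = (data) => crypto.createHash('md5').update(data).digest('hex');

// Stand-in for the downloaded file: the PNG header bytes used above.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

// Reading binary data as text means decoding it as UTF-8.
// 0x89 is not valid UTF-8 on its own, so it becomes U+FFFD.
const asText = original.toString('utf8');

// Turning the string back into bytes depends on the encoding you pick.
const viaUtf8 = Buffer.from(asText, 'utf8');     // U+FFFD is written as EF BF BD
const viaLatin1 = Buffer.from(asText, 'latin1'); // 'binary' is an alias; keeps only the low byte (FD)

console.log('original :', original.toString('hex'));  // 89504e47
console.log('via utf8 :', viaUtf8.toString('hex'));   // efbfbd504e47
console.log('via latin1:', viaLatin1.toString('hex')); // fd504e47

// Hash A (the string) equals the 'utf8' re-encoding, because hashing a
// string encodes it as UTF-8 by default.
console.log('A === B(utf8):', md5(asText) === md5(viaUtf8));  // true
// Hash C (the real bytes) matches neither.
console.log('C === A:', md5(original) === md5(asText));       // false
console.log('C === B(latin1):', md5(original) === md5(viaLatin1)); // false
```

None of the re-encodings restores 0x89, which is why no choice of encoding in the question's experiment reproduced hash C.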

Node.js had no native Blob object when this was written, so it makes sense that axios didn't monkey-patch one in just to return a response that nothing else could consume anyway.

From a browser, you'd have exactly the same responses:

function fetchAs( type ) {
  return axios( {
    method: 'get',
    url: 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png',
    responseType: type
  } );
}

function loadImage( data, type ) {
  // all of these can be passed to the Blob constructor directly
  const new_blob = new Blob( [ data ], { type: 'image/png' } );
  // with a blob: URI, the browser will try to load 'data' as-is
  const url = URL.createObjectURL( new_blob );

  const img = document.getElementById( type + '_img' );
  img.src = url;
  return new Promise( (res, rej) => {
    img.onload = e => res( img );
    img.onerror = rej;
  } );
}

[
  'json', // will fail
  'text', // will fail
  'arraybuffer',
  'blob'
].forEach( type =>
  fetchAs( type )
    .then( resp => loadImage( resp.data, type ) )
    .then( img => console.log( type, 'loaded' ) )
    .catch( err => console.error( type, 'failed' ) )
);

<script src="https://unpkg.com/axios/dist/axios.min.js"></script>

<figure>
  <figcaption>json</figcaption>
  <img id="json_img">
</figure>
<figure>
  <figcaption>text</figcaption>
  <img id="text_img">
</figure>
<figure>
  <figcaption>arraybuffer</figcaption>
  <img id="arraybuffer_img">
</figure>
<figure>
  <figcaption>blob</figcaption>
  <img id="blob_img">
</figure>
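A real Blob, by contrast, keeps the bytes intact, so converting it to an ArrayBuffer is lossless, while reading it as text is not. A minimal sketch, assuming an environment with a global Blob (browsers, or Node.js 18 and later); `blobToBytes` is a hypothetical helper name.

```javascript
// Hypothetical helper: read the raw bytes out of a Blob.
async function blobToBytes(blob) {
  return new Uint8Array(await blob.arrayBuffer());
}

// The PNG header bytes again, wrapped in a genuine Blob.
const pngHeader = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
const blob = new Blob([pngHeader]);

const demo = (async () => {
  // Lossless path: Blob -> ArrayBuffer -> bytes.
  const viaArrayBuffer = await blobToBytes(blob);
  // Lossy path: Blob -> UTF-8 text -> bytes.
  const viaText = new TextEncoder().encode(await blob.text());

  console.log('arrayBuffer():', Array.from(viaArrayBuffer)); // [ 137, 80, 78, 71 ]
  console.log('text():', Array.from(viaText)); // [ 239, 191, 189, 80, 78, 71 ]
  return { viaArrayBuffer, viaText };
})();
```

This is consistent with the browser result above, where both 'arraybuffer' and 'blob' load correctly and only the text-based response types fail.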

Answer 1: 34 · accepted · 2020-02-29 03:49:47