
How does axios handle blob vs arraybuffer as responseType?

I'm downloading a zip file with axios. For further processing, I need to get the "raw" data that has been downloaded. As far as I can see, in JavaScript there are two types for this: Blobs and ArrayBuffers. Both can be specified as responseType in the request options.

In the next step, the zip file needs to be uncompressed. I've tried two libraries for this: js-zip and adm-zip. Both want the data to be an ArrayBuffer. So far so good, I can convert the blob to a buffer. After this conversion, adm-zip always happily extracts the zip file. However, js-zip complains about a corrupted file unless the zip has been downloaded with 'arraybuffer' as the axios responseType. js-zip does not work on a buffer that has been taken from a blob.

This was very confusing to me. I thought both ArrayBuffer and Blob were essentially just views on the underlying memory. There might be a difference in performance between downloading something as a blob vs. a buffer. But the resulting data should be the same, right?

Well, I decided to experiment and found this:

If you specify responseType: 'blob', axios converts response.data to a string. Let's say you hash this string and get hashcode A. Then you convert it to a buffer. For this conversion, you need to specify an encoding. Depending on the encoding, you will get a variety of new hashes; let's call them B1, B2, B3, ... When specifying 'utf8' as the encoding, I get back the original hash A.

So I guess that when downloading data as a 'blob', axios implicitly converts it to a string encoded with utf8. This seems very reasonable.

Now you specify responseType: 'arraybuffer'. Axios provides you with a buffer as response.data. Hash the buffer and you get a hashcode C. This code does not correspond to any of A, B1, B2, ...

So when downloading data as an 'arraybuffer', you get entirely different data?

It now makes sense to me that the unzipping library js-zip complains if the data is downloaded as a 'blob'. It probably is actually corrupted somehow. But then how is adm-zip able to extract it? I checked the extracted data, and it is correct. This might only be the case for this specific zip archive, but it nevertheless surprises me.

Here is the sample code I used for my experiments:

//typescript import syntax, this is executed in nodejs
import axios from 'axios';
import * as crypto from 'crypto';

axios.get(
    "http://localhost:5000/folder.zip", //hosted with serve
    { responseType: 'blob' }) // replace this with 'arraybuffer' and response.data will be a buffer
    .then((response) => {
        console.log(typeof (response.data));

        // first hash the response itself
        console.log(crypto.createHash('md5').update(response.data).digest('hex'));

        // then convert to a buffer and hash again
        // replace 'binary' with any valid encoding name
        const buffer = Buffer.from(response.data, 'binary');
        console.log(crypto.createHash('md5').update(buffer).digest('hex'));
        //...
    });

What creates the difference here, and how do I get the 'true' downloaded data?

From the axios docs:

// `responseType` indicates the type of data that the server will respond with
// options are: 'arraybuffer', 'document', 'json', 'text', 'stream'
//   browser only: 'blob'
responseType: 'json', // default

'blob' is a "browser only" option.

So from node.js, when you set responseType: "blob", "json" will actually be used, which I guess falls back to "text" when no parseable JSON data has been fetched.
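In practice this means that, in Node, a string in response.data is a sign that the body has already been decoded as text. Below is a minimal sketch of a guard that could sit between the download and the unzipping step. `toBuffer` is a hypothetical helper written for illustration, not part of axios, and it assumes the request was made with responseType: 'arraybuffer'.

```javascript
// `toBuffer` is a hypothetical helper for illustration, not an axios API.
// It normalizes binary response data to a Node Buffer and rejects strings,
// because a string means the bytes were already decoded as text.
function toBuffer(data) {
  if (typeof data === 'string') {
    throw new TypeError('response.data is a string: the body was decoded as text');
  }
  if (Buffer.isBuffer(data)) {
    return data;
  }
  if (data instanceof ArrayBuffer) {
    // Creates a Buffer that shares memory with the ArrayBuffer (no copy).
    return Buffer.from(data);
  }
  if (ArrayBuffer.isView(data)) {
    // Typed arrays and DataViews: respect their offset and length.
    return Buffer.from(data.buffer, data.byteOffset, data.byteLength);
  }
  throw new TypeError('Unsupported response.data type');
}

// Stand-in for binary response data: the first four bytes of a zip file.
const fakeArrayBuffer = new Uint8Array([0x50, 0x4b, 0x03, 0x04]).buffer;
console.log(toBuffer(fakeArrayBuffer));

try {
  toBuffer('PK\u0003\u0004');
} catch (err) {
  console.log(err.message);
}
```

With such a guard, the result of `toBuffer(response.data)` is what would be handed to the zip library, and a text response fails loudly instead of producing a "corrupted file" error later.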

Fetching binary data as text is prone to generating corrupted data. Because the text returned by Body.text() and many other APIs is a USVString (which does not allow unpaired surrogate code points) and because the response is decoded as UTF-8, some bytes of the binary file can't be mapped to characters and are replaced by the replacement character � (U+FFFD), with no way to recover what that data was before: your data is corrupted.

Here is a snippet explaining this, using the header of a .png file, 0x89 0x50 0x4E 0x47, as an example.

(async () => {

  const url = 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png';
  // fetch as binary
  const buffer = await fetch( url ).then( resp => resp.arrayBuffer() );

  const header = new Uint8Array( buffer ).slice( 0, 4 );
  console.log( 'binary header', header ); // [ 137, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( header ) );
  // [ "U+0089", "U+0050", "U+004E", "U+0047" ]
  // You can read more about the (U+0089) character here
  // https://www.fileformat.info/info/unicode/char/0089/index.htm
  // You can see in the left table how this character in UTF-8 needs two bytes (0xC2 0x89)
  // A lone 0x89 byte is therefore not valid UTF-8: the decoder can't map it
  // to a character and converts it to the replacement character

  // read as UTF-8
  const utf8_str = await new Blob( [ header ] ).text();
  console.log( 'read as UTF-8', utf8_str ); // "�PNG"
  // build back a binary array from that string
  const utf8_binary = [ ...utf8_str ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', utf8_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( utf8_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // You can read more about the character � (U+FFFD) here
  // https://www.fileformat.info/info/unicode/char/0fffd/index.htm
  //
  // P (U+0050), N (U+004E) and G (U+0047) characters are compatible between UTF-8 and UTF-16
  // For these there is no encoding loss
  // (that's how base64 encoding makes it possible to send binary data as text)

  // now let's see what fetching as text holds
  const fetched_as_text = await fetch( url ).then( resp => resp.text() );
  const header_as_text = fetched_as_text.slice( 0, 4 );
  console.log( 'fetched as "text"', header_as_text ); // "�PNG"
  const as_text_binary = [ ...header_as_text ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', as_text_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( as_text_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // It's been read as UTF-8, we lost the first byte.

})();

function entityEncode( arr ) {
  return Array.from( arr ).map( val => 'U+' + toHex( val ) );
}
function toHex( num ) {
  return num.toString( 16 ).padStart( 4, '0' ).toUpperCase();
}
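The hashing experiment from the question can be reproduced in Node without any network access. The sketch below is an illustration under stated assumptions: the byte sequence is a made-up stand-in for a downloaded zip body, and decoding it with TextDecoder stands in for what a text response amounts to. It suggests why hash A and the 'utf8' hash coincide (Node's hash.update() encodes strings as UTF-8 when no input encoding is given) and why neither matches hash C of the original bytes.

```javascript
const crypto = require('node:crypto');

const md5 = (data) => crypto.createHash('md5').update(data).digest('hex');

// Made-up stand-in for a downloaded body: the zip signature "PK\x03\x04"
// followed by bytes that are not valid UTF-8 on their own.
const original = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x89, 0xff, 0x00, 0x80]);

// What a text response amounts to: the bytes decoded as UTF-8.
// Each invalid byte becomes the replacement character U+FFFD.
const asText = new TextDecoder('utf-8').decode(original);

// Hash C: the bytes exactly as received.
const hashC = md5(original);
// Hash A: the decoded string; update() encodes strings as UTF-8 by default.
const hashA = md5(asText);
// Hashes B: the string turned back into bytes with two different encodings.
const hashUtf8 = md5(Buffer.from(asText, 'utf8'));
const hashBinary = md5(Buffer.from(asText, 'binary'));

console.log('A === B(utf8):  ', hashA === hashUtf8);   // true
console.log('C === A:        ', hashC === hashA);      // false
console.log('C === B(binary):', hashC === hashBinary); // false
console.log('byte lengths:', original.length, 'vs', Buffer.from(asText, 'utf8').length);
```

Getting hash A back with 'utf8' therefore only shows that the string was re-encoded the same way it was hashed, not that the original bytes survived the decoding.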


There is natively no Blob object in node.js, so it makes sense that axios didn't monkey-patch one just to return a response no one else would be able to consume anyway.

From a browser, you'd have exactly the same responses:

function fetchAs( type ) {
  return axios( {
    method: 'get',
    url: 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png',
    responseType: type
  } );
}

function loadImage( data, type ) {
  // we can pass all of them to the Blob constructor directly
  const new_blob = new Blob( [ data ], { type: 'image/png' } );
  // with a blob: URI, the browser will try to load 'data' as-is
  const url = URL.createObjectURL( new_blob );

  const img = document.getElementById( type + '_img' );
  img.src = url;
  return new Promise( (res, rej) => {
    img.onload = e => res( img );
    img.onerror = rej;
  } );
}

[
  'json', // will fail
  'text', // will fail
  'arraybuffer',
  'blob'
].forEach( type =>
  fetchAs( type )
    .then( resp => loadImage( resp.data, type ) )
    .then( img => console.log( type, 'loaded' ) )
    .catch( err => console.error( type, 'failed' ) )
);

<script src="https://unpkg.com/axios/dist/axios.min.js"></script>

<figure>
  <figcaption>json</figcaption>
  <img id="json_img">
</figure>
<figure>
  <figcaption>text</figcaption>
  <img id="text_img">
</figure>
<figure>
  <figcaption>arraybuffer</figcaption>
  <img id="arraybuffer_img">
</figure>
<figure>
  <figcaption>blob</figcaption>
  <img id="blob_img">
</figure>
