How do I correctly convert a PDF file to base64 in the browser?

Question

I have three failing versions of the following code in a Chrome extension. It attempts to intercept a click on a link pointing to a PDF file, fetch that file, convert it to base64, and then log it. But I'm afraid I don't really know anything about binary formats and encodings, so I'm royally sucking this up.

var links = document.getElementsByTagName("a");

function transform(blob) {
    return btoa(String.fromCharCode.apply(null, new Uint8Array(blob)));
};

function getlink(link) {
    var x = new XMLHttpRequest();
    x.open("GET", link, true);
    x.responseType = 'blob';
    x.onload = function(e) {
        console.log("Raw response:");
        console.log(x.response);
        console.log("Direct transformation:");
        console.log(btoa(x.response));
        console.log("Mysterious thing I got from SO:");
        console.log(transform(x.response));
        window.location.href = link;
    };

    x.onerror = function (e) {
        console.error(x.statusText);
    };

    x.send(null);
};

for (var i = 0, len = links.length; i < len; i++) {
    var l = links[i];
    l.addEventListener("click", function(e) {
        e.preventDefault();
        e.stopPropagation();
        e.stopImmediatePropagation();
        getlink(this.href);
    }, false);
};

Version 1 doesn't set x.responseType or call transform. It was my original, naive implementation. It threw an error: "The string to be encoded contains characters outside of the Latin1 range."
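As a side note on that error: with no responseType set, the response is decoded as text, and bytes that are not valid text are typically replaced with U+FFFD, which btoa cannot encode. The sketch below reproduces the failure with a made-up string (the `garbled` value is an assumption for illustration, not the actual response).

```javascript
// Hypothetical stand-in for a PDF response that was decoded as text:
// invalid byte sequences have become U+FFFD replacement characters.
const garbled = "%PDF-1.4\n%\uFFFD\uFFFD";

let threw = false;
try {
  // btoa only accepts code points 0x00-0xFF, so this call throws.
  btoa(garbled);
} catch (err) {
  threw = true;
  console.log("btoa failed:", err.name);
}
```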

After googling that error, I found this prior SO question, which suggests that when parsing an image:

The response type needs to be set to blob. So this code does that.
There's some weird line, and I don't know what it does at all: String.fromCharCode.apply(null, new Uint8Array(blob)).

Because I know nothing about binary formats, I guessed, probably stupidly, that making a PDF base64 would be the same as making some random image format base64. So, in fine SO tradition, I copied code that I don't really understand. In stages.

Version 2 of the code just set the response type to blob but didn't try the second transformation. The code ran and logged something that looked like a base64 string, but a clearly incorrect one. In its entirety, it logged:

W29iamVjdCBCbG9iXQ==

Which is just goofily wrong. It's obviously too short for a 46k PDF file, and a reference base64 encoding I created with Python from the command line was much, much longer, as one would expect.
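A likely explanation for the short string: btoa converts its argument to a string first, and a Blob stringifies to "[object Blob]". Decoding the logged value, as in this small sketch, shows exactly that text.

```javascript
// The exact string that Version 2 logged.
const logged = "W29iamVjdCBCbG9iXQ==";

// Decode it back from base64 to see what was actually encoded.
const decoded = atob(logged);
console.log(decoded); // [object Blob]
```

So Version 2 encoded the Blob's default string representation, not the PDF bytes.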

Version 3 of the code then also applies the mysterious transformation using String.fromCharCode and all the rest, which I shoved into the transform function.

However, that doesn't log anything at all: a blank line appears in the console in its appropriate place. No errors, no nonsense output, just a blank line.
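One way to see where the blank line probably comes from: a Blob is neither an ArrayBuffer nor array-like, so passing it to the Uint8Array constructor yields an empty array, and the rest of the pipeline then encodes an empty string. The sketch below uses a small made-up Blob (its contents are an assumption) rather than the real response.

```javascript
// Hypothetical stand-in for x.response when responseType is "blob".
const blob = new Blob(["%PDF-1.4 sample bytes"], { type: "application/pdf" });

// A Blob has a `size` but no `length`, so the typed array ends up empty.
const bytes = new Uint8Array(blob);
console.log(bytes.length); // 0

// With no bytes, the string is empty, and so is its base64 encoding.
const binary = String.fromCharCode.apply(null, bytes);
const encoded = btoa(binary);
console.log(JSON.stringify(encoded)); // ""
```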

I know I'm getting the correct file from prior testing. Also, the call to log the raw response object produces Blob {size: 45587, type: "application/pdf"}, which is the correct file size for the PDF I'm experimenting with, so the blob actually contains what it should when it gets into the browser.

I'm using, and only need to support, a current version of Chrome.

Can someone tell me what I'm doing wrong?

Thanks!

Answer 1

If you only need to support modern browsers, you should also be able to use FileReader#readAsDataURL.

That would let you do something like this:

var reader  = new FileReader();
reader.addEventListener("load", function () {
  console.log(reader.result);
}, false);
// The function accepts Blobs and Files
reader.readAsDataURL(x.response);

This logs a data URI, which will contain your base64 data.
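If only the bare base64 payload is wanted, the part after the first comma of the data URI can be taken. A minimal sketch follows; the helper name `dataUrlToBase64` and the sample URI are made up for illustration.

```javascript
// Hypothetical helper: return everything after the first comma of a data URI.
function dataUrlToBase64(dataUrl) {
  const comma = dataUrl.indexOf(",");
  if (comma === -1) {
    throw new Error("Not a data URI: no comma found");
  }
  return dataUrl.slice(comma + 1);
}

// Sample value shaped like what reader.result holds for a PDF Blob.
const sampleResult = "data:application/pdf;base64,JVBERg==";
const payload = dataUrlToBase64(sampleResult);
console.log(payload); // JVBERg==
```

Inside the load handler, this would be called as dataUrlToBase64(reader.result).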

Answer 2

I think I've found my own solution. The response type needs to be arraybuffer, not blob.
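A sketch of how the conversion might look once the response is an ArrayBuffer. The helper name `arrayBufferToBase64` and the chunk size are assumptions; chunking is added here only to stay under the argument-count limit that String.fromCharCode.apply can hit on large files.

```javascript
// Hypothetical helper: base64-encode the bytes of an ArrayBuffer.
function arrayBufferToBase64(buffer) {
  const bytes = new Uint8Array(buffer);
  const chunkSize = 0x8000; // assumed chunk size, keeps apply() calls small
  let binary = "";
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // Turn each byte into the character with the same code (0-255).
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// Sample input: the four bytes of "%PDF", the start of a PDF file.
const sampleBuffer = new Uint8Array([0x25, 0x50, 0x44, 0x46]).buffer;
const sampleBase64 = arrayBufferToBase64(sampleBuffer);
console.log(sampleBase64); // JVBERg==
```

In the question's code this would mean setting x.responseType = 'arraybuffer' and logging arrayBufferToBase64(x.response) in the onload handler.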

Answer 1
1 2016-07-13 04:08:46

Answer 2
0 2016-07-13 01:38:45