How to correctly convert a PDF file to base64 in the browser?

I have three failing versions of the following code in a Chrome extension, which attempts to intercept a click on a link pointing to a PDF file, fetch that file, convert it to base64, and then log it. But I'm afraid I don't really know anything about binary formats and encodings, so I'm royally messing this up.

var links = document.getElementsByTagName("a");

function transform(blob) {
    return btoa(String.fromCharCode.apply(null, new Uint8Array(blob)));
};

function getlink(link) {
    var x = new XMLHttpRequest();
    x.open("GET", link, true);
    x.responseType = 'blob';
    x.onload = function(e) {
        console.log("Raw response:");
        console.log(x.response);
        console.log("Direct transformation:");
        console.log(btoa(x.response));
        console.log("Mysterious thing I got from SO:");
        console.log(transform(x.response));
        window.location.href = link;
    };

    x.onerror = function (e) {
        console.error(x.statusText);
    };

    x.send(null);
};

for (i = 0, len = links.length; i < len; i++) {
    var l = links[i]
    l.addEventListener("click", function(e) {
        e.preventDefault();
        e.stopPropagation();
        e.stopImmediatePropagation();
        getlink(this.href);
    }, false);
};

Version 1 didn't set x.responseType or call transform. It was my original, naive implementation. It threw an error: "The string to be encoded contains characters outside of the Latin1 range."

After googling that error, I found a prior SO question, which suggests that in parsing an image:

  1. The response type needs to be set to blob. So this code does that.
  2. There's some weird line, and I don't know what it does at all: String.fromCharCode.apply(null, new Uint8Array(blob)).

Because I know nothing about binary formats, I guessed, probably stupidly, that base64-encoding a PDF would be the same as base64-encoding some random image format. So, in fine SO tradition, I copied code that I don't really understand. In stages.

Version 2 of the code just set the response type to blob but didn't try the second transformation. The code ran and logged something that looked like a base64 string, but a clearly incorrect one. In its entirety, it logged:

W29iamVjdCBCbG9iXQ==

Which is just goofily wrong. It's obviously too short for a 46k PDF file, and a reference base64 encoding I created with Python from the command line was much, much longer, as one would expect.
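As a side note on that output: the short string is what you get when btoa() is handed something that is not a string, because the argument is first converted to a string. A minimal sketch of that effect follows; it assumes an environment with global btoa and atob, and fakeBlob is a hypothetical stand-in whose string form matches what a real Blob produces.

```javascript
// Hypothetical stand-in for a Blob: a real Blob stringifies to "[object Blob]".
var fakeBlob = { toString: function () { return "[object Blob]"; } };

// btoa() encodes the string form of its argument, not the underlying bytes.
// The conversion is written out explicitly here to make that step visible.
var coerced = String(fakeBlob);
var encoded = btoa(coerced);

console.log(encoded);       // W29iamVjdCBCbG9iXQ==
console.log(atob(encoded)); // [object Blob]
```

So the logged value is the base64 of the 13-character label "[object Blob]", not of the PDF's 45587 bytes.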

Version 3 of the code then also applies the mysterious transformation using String.fromCharCode and all the rest, which I shoved into the transform function.

However, that doesn't log anything at all: a blank line appears in the console in its appropriate place. No errors, no nonsense output, just a blank line.

I know I'm getting the correct file from prior testing. Also, the call to log the raw response object produces Blob {size: 45587, type: "application/pdf"}, which is the correct file size for the PDF I'm experimenting with, so the blob actually contains what it should when it gets into the browser.

I'm using, and only need to support, a current version of Chrome.

Can someone tell me what I'm doing wrong?

Thanks!

If you only need to support modern browsers, you should also be able to use FileReader#readAsDataURL.

That would let you do something like this:

var reader  = new FileReader();
reader.addEventListener("load", function () {
  console.log(reader.result);
}, false);
// The function accepts Blobs and Files
reader.readAsDataURL(x.response);

This logs a data URI, which will contain your base64 data.
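If you want only the base64 text rather than the whole data URI, the prefix has to be stripped off. A rough sketch, assuming the result has the usual data:<type>;base64,<payload> shape; base64FromDataUrl is a hypothetical helper name, not part of any API.

```javascript
// Hypothetical helper: return only the base64 payload of a data URL
// such as "data:application/pdf;base64,JVBERi0x...".
function base64FromDataUrl(dataUrl) {
  var marker = ";base64,";
  var index = dataUrl.indexOf(marker);
  if (index === -1) {
    // Not base64-encoded (or not a data URL at all), so there is nothing to extract.
    throw new Error("Not a base64 data URL");
  }
  return dataUrl.slice(index + marker.length);
}
```

Inside the load handler above, that would be called as base64FromDataUrl(reader.result).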

I think I've found my own solution. The response type needs to be arraybuffer, not blob.
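That fits the symptoms: new Uint8Array(blob) does not read a Blob's bytes (a Blob has a size but no length and is not iterable), so it yields an empty array and btoa("") logs a blank line, whereas an ArrayBuffer exposes the bytes directly. Here is a sketch of how the fix might look. The names arrayBufferToBase64 and logPdfAsBase64 are made up for illustration, and the chunking is an extra precaution for large files rather than something taken from the code above.

```javascript
// Convert an ArrayBuffer to a base64 string.
function arrayBufferToBase64(buffer) {
  var bytes = new Uint8Array(buffer);
  var chunkSize = 0x8000; // convert in slices so apply() never gets a huge argument list
  var binary = "";
  for (var i = 0; i < bytes.length; i += chunkSize) {
    // Each byte becomes one character with code 0-255, which is what btoa() expects.
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// Hypothetical replacement for getlink(): fetch the file as an ArrayBuffer and log its base64.
function logPdfAsBase64(link) {
  var x = new XMLHttpRequest();
  x.open("GET", link, true);
  x.responseType = "arraybuffer"; // not "blob"
  x.onload = function () {
    console.log(arrayBufferToBase64(x.response));
  };
  x.onerror = function () {
    console.error(x.statusText);
  };
  x.send(null);
}

// The four bytes "%PDF" that start every PDF file:
console.log(arrayBufferToBase64(new Uint8Array([37, 80, 68, 70]).buffer)); // JVBERg==
```

A real PDF's output should therefore begin with "JVBER", which is a quick sanity check against the reference encoding.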

