
How to get a UTF8 encoded string from a Base64 encoded string in Javascript?

AEEAQgBDAGEAYgBj is Base64 for ABCabc; however, when I run this code:

let a = Buffer.from('AEEAQgBDAGEAYgBj', 'base64').toString('utf-8');
console.log(Buffer.from(a, 'utf8'));

The printed result in the console is <Buffer 00 41 00 42 00 43 00 61 00 62 00 63>, which is UTF16 (LE).

I would assume that, since I am creating a Buffer from a UTF8 encoded string, the result would be <Buffer 41 42 43 61 62 63>. So how can I get an actual UTF8 encoded string from Base64?

The problem is that your original data is Base64 encoded UTF16-BE (the high byte of each 16-bit unit comes first), not UTF16-LE. If you look at the string a after your first line, you'll see that it contains the NUL characters (U+0000) that show up as zero bytes in the final buffer:

let a = Buffer.from("AEEAQgBDAGEAYgBj", "base64").toString("utf-8");
console.log(a.length);
// 12
console.log([...a].map(ch => ch.charCodeAt(0).toString(16).padStart(2, "0")).join(" "));
// 00 41 00 42 00 43 00 61 00 62 00 63
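
One way to check the byte order without decoding to a string first is to look at the raw bytes directly. The following is a minimal sketch, not part of the original answer; the variable names raw, firstUnitBE and firstUnitLE are only illustrative. It reads the first 16-bit unit in both byte orders, and only the big-endian reading yields 0x41, the code for "A":

```javascript
// Decode the Base64 text to raw bytes only; no string decoding yet.
const raw = Buffer.from("AEEAQgBDAGEAYgBj", "base64");

// Hex dump of the bytes exactly as they are stored.
console.log(raw.toString("hex"));
// 004100420043006100620063

// Read the first two bytes as one 16-bit unit in each byte order.
const firstUnitBE = raw.readUInt16BE(0); // high byte first
const firstUnitLE = raw.readUInt16LE(0); // low byte first

console.log(firstUnitBE.toString(16), firstUnitLE.toString(16));
// 41 4100
```

Because the zero byte precedes each ASCII byte, the data is big-endian.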

So the question becomes: how do you read the UTF16-BE text in the buffer returned by Buffer.from("AEEAQgBDAGEAYgBj", "base64")? Node.js's Buffer doesn't support UTF16-BE directly (there is no "utf16be" encoding in its standard library), but you can get there by calling swap16, which swaps the two bytes of each 16-bit unit in place, and then reading the buffer as UTF16-LE ("utf16le", which is in Node.js's standard library):

let a = Buffer.from("AEEAQgBDAGEAYgBj", "base64").swap16().toString("utf16le");
console.log(a.length);
// 6
console.log(a);
// ABCabc
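
Note that swap16 modifies the buffer it is called on, and it throws if the buffer length is not a multiple of 2. If the original bytes are still needed afterwards, one option is to swap a copy instead. The helper below is a sketch under that assumption; decodeUtf16BE is a hypothetical name, not a Node.js API:

```javascript
// Hypothetical helper: decode UTF16-BE bytes without mutating the input.
function decodeUtf16BE(buf) {
  if (buf.length % 2 !== 0) {
    // swap16 would also throw here; fail early with a clearer message.
    throw new RangeError("UTF16 data must have an even number of bytes");
  }
  // Buffer.from(buffer) copies the bytes, so swap16 only touches the copy.
  return Buffer.from(buf).swap16().toString("utf16le");
}

const original = Buffer.from("AEEAQgBDAGEAYgBj", "base64");
const decoded = decodeUtf16BE(original);

console.log(decoded);
// ABCabc
console.log(original.toString("hex")); // the input is unchanged
// 004100420043006100620063
```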

Now a is a normal string. If you want a buffer containing its contents in UTF8, you can use Buffer.from(a, "utf8"):

let a = Buffer.from("AEEAQgBDAGEAYgBj", "base64").swap16().toString("utf16le");
console.log(a.length);
// 6
console.log(a);
// ABCabc
let b = Buffer.from(a); // (Default is `"utf8"` but you could supply that explicitly)
console.log(b);
// <Buffer 41 42 43 61 62 63>
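
If the goal is to hand the text on as Base64 again, this time with UTF8 bytes underneath, the same steps can be chained. This is only a sketch of that assumed use case; the names text, utf8Bytes and utf8Base64 are illustrative:

```javascript
// Base64 (UTF16-BE bytes) -> JavaScript string
const text = Buffer.from("AEEAQgBDAGEAYgBj", "base64")
  .swap16()
  .toString("utf16le");

// JavaScript string -> UTF8 bytes -> Base64
const utf8Bytes = Buffer.from(text, "utf8");
const utf8Base64 = utf8Bytes.toString("base64");

console.log(utf8Bytes.toString("hex"));
// 414243616263
console.log(utf8Base64);
// QUJDYWJj
```

The resulting Base64 text differs from the input because Base64 encodes bytes, and the UTF8 and UTF16-BE byte sequences for the same characters are different.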
