Is there a way to redefine the standard ASCII character set that JavaScript's charCodeAt and fromCharCode use from within a function?
For encoding, JavaScript maps characters using the standard ASCII table. I found the function below, which correctly encodes to Ascii85/Base85. But I want to encode to the Z85 variation because it contains the set of symbols that I require. My understanding is that Z85 encoding works exactly the same as Ascii85/Base85, except that Z85 uses a different set of symbols, in a different order, from the standard Ascii85 mapping. So the character set is the only difference:
Ascii85 uses 85 consecutive ASCII characters, codes 33 ("!") through 117 ("u") (reference):
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
Z85 uses a custom set of 85 characters (reference):
0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#
My question is: is there any way to redefine the character set that charCodeAt and fromCharCode refer to in this function, so that it would then encode to Z85?
// By Steve Hanov. Released to the public domain.
function encodeAscii85(input) {
    // Adobe standard prefix "<~" removed; start from an empty string
    // so that output is defined before it is appended to.
    var output = "";
    var chr1, chr2, chr3, chr4, chr, enc1, enc2, enc3, enc4, enc5;
    var i = 0;
    while (i < input.length) {
        // Access past the end of the string is intentional.
        chr1 = input.charCodeAt(i++);
        chr2 = input.charCodeAt(i++);
        chr3 = input.charCodeAt(i++);
        chr4 = input.charCodeAt(i++);
        // Pack four bytes, most significant first, into an unsigned 32-bit value.
        chr = ((chr1 << 24) | (chr2 << 16) | (chr3 << 8) | chr4) >>> 0;
        // Five base-85 digits, most significant first, offset to "!" (33).
        enc1 = (chr / (85 * 85 * 85 * 85) | 0) % 85 + 33;
        enc2 = (chr / (85 * 85 * 85) | 0) % 85 + 33;
        enc3 = (chr / (85 * 85) | 0) % 85 + 33;
        enc4 = (chr / 85 | 0) % 85 + 33;
        enc5 = chr % 85 + 33;
        output += String.fromCharCode(enc1) +
            String.fromCharCode(enc2);
        // A short final group emits only (bytes read + 1) characters.
        if (!isNaN(chr2)) {
            output += String.fromCharCode(enc3);
            if (!isNaN(chr3)) {
                output += String.fromCharCode(enc4);
                if (!isNaN(chr4)) {
                    output += String.fromCharCode(enc5);
                }
            }
        }
    }
    // Adobe standard suffix "~>" removed
    return output;
}
Extra notes:
Alternatively, I thought I could use something like the following function, but the problem is that it doesn't properly encode Ascii85 in the first place. If it were correct, Hello world! should encode to 87cURD]j7BEbo80, but this function encodes it to RZ!iCB=*gD0D5_+ (reference).
I don't understand the algorithm well enough to know what is wrong with the mapping here. Ideally, if it were encoding correctly, I should be able to update this function to use the Z85 character set:
// Adapted from: Ascii85 JavaScript implementation, 2012.10.16 Jim Herrero
// Original: https://jsfiddle.net/nderscore/bbKS4/
var Ascii85 = {
    // Ascii85 mapping
    _alphabet: "!\"#$%&'()*+,-./0123456789:;<=>?@"+
               "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`"+
               "abcdefghijklmnopqrstu"+
               "y"+ // short form 4 spaces (optional)
               "z", // short form 4 nulls (optional)
    // functions
    encode: function(input) {
        var alphabet = Ascii85._alphabet,
            useShort = alphabet.length > 85,
            output = "", buffer, val, i, j, l;
        for (i = 0, l = input.length; i < l;) {
            buffer = [0,0,0,0];
            for (j = 0; j < 4; j++)
                if(input[i])
                    buffer[j] = input.charCodeAt(i++);
            for (val = buffer[3], j = 2; j >= 0; j--)
                val = val*256+buffer[j];
            if (useShort && !val)
                output += alphabet[86];
            else if (useShort && val == 0x20202020)
                output += alphabet[85];
            else {
                for (j = 0; j < 5; j++) {
                    output += alphabet[val%85];
                    val = Math.floor(val/85);
                }
            }
        }
        return output;
    }
};
Character codes are character codes. You can't change the behavior of String.fromCharCode() or String.charCodeAt().
However, you can store your custom character set in an array and use array indexing and Array.indexOf() to look up entries.
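As a minimal sketch of that lookup idea: the array below holds the Z85 character set from the question, and the helper names (`z85Symbol`, `z85Value`, `ascii85ToZ85`) are hypothetical, chosen only for illustration. It assumes the Ascii85 text contains no "z"/"y" shortcuts and no `<~ ~>` wrapper, so every character is a plain digit offset from code 33.

```javascript
// Z85 character set from the question, split into an array of 85 symbols.
var Z85_ALPHABET =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#"
        .split("");

// Digit value (0..84) -> symbol: array indexing takes the place of
// String.fromCharCode(value + 33).
function z85Symbol(value) {
    return Z85_ALPHABET[value];
}

// Symbol -> digit value: Array.indexOf takes the place of charCodeAt(...) - 33.
// Returns -1 for a character that is not in the alphabet.
function z85Value(symbol) {
    return Z85_ALPHABET.indexOf(symbol);
}

// Hypothetical helper: re-map already-encoded Ascii85 text to Z85,
// one character at a time (assumes plain Ascii85 digits only).
function ascii85ToZ85(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        out += z85Symbol(text.charCodeAt(i) - 33);
    }
    return out;
}
```

For example, `ascii85ToZ85("87cURD]j7BEbo80")` re-maps the expected Ascii85 encoding of Hello world! quoted in the question.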
Updating this function to work with Z85 will be tricky, though, because String.fromCharCode() and String.charCodeAt() are used in two different contexts: they're sometimes used to access the unencoded string (which doesn't need to change), and sometimes the encoded string (which does). You will need to take care not to confuse the two.
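One way to keep the two contexts apart is sketched below. It is an illustration under stated assumptions, not the original author's code: `encodeBase85`, `ASCII85_ALPHABET`, and `Z85_ALPHABET` are hypothetical names. charCodeAt stays on the input side, where it reads the raw bytes, while the output side looks each base-85 digit up in an alphabet string instead of calling fromCharCode. Bytes are combined most significant first and digits are emitted most significant first; the second function in the question appears to do both in the opposite order, which is consistent with its RZ!iCB=*gD0D5_+ output. A short final group is truncated the same way as in the question's first function. That truncation is carried over as an assumption, since the Z85 specification expects input lengths that are a multiple of 4.

```javascript
// Ascii85 alphabet: character codes 33 ("!") through 117 ("u").
var ASCII85_ALPHABET = "";
for (var c = 33; c <= 117; c++) {
    ASCII85_ALPHABET += String.fromCharCode(c);
}

// Z85 alphabet, copied from the question.
var Z85_ALPHABET =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#";

// Hypothetical helper: base-85 encode a byte string with any 85-symbol alphabet.
function encodeBase85(input, alphabet) {
    var output = "";
    for (var i = 0; i < input.length; i += 4) {
        var count = Math.min(4, input.length - i); // bytes in this group
        var value = 0;
        for (var j = 0; j < 4; j++) {
            // Input side: charCodeAt reads the unencoded data (unchanged).
            // Missing bytes in a short final group are treated as 0.
            var code = j < count ? input.charCodeAt(i + j) & 0xFF : 0;
            // Multiplication avoids the signed 32-bit result of << 24.
            value = value * 256 + code;
        }
        var digits = "";
        for (var k = 0; k < 5; k++) {
            // Output side: alphabet lookup replaces String.fromCharCode(d + 33).
            digits = alphabet.charAt(value % 85) + digits;
            value = Math.floor(value / 85);
        }
        // A group of n bytes yields n + 1 characters.
        output += digits.slice(0, count + 1);
    }
    return output;
}
```

With this sketch, `encodeBase85("Hello world!", ASCII85_ALPHABET)` returns 87cURD]j7BEbo80, the value the question expects, and passing `Z85_ALPHABET` instead returns nm=QNzY<mxA+]nf.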