Encode binary data as ASCII in Java
I have a bitset of binary data that I wish to encode compactly as an ASCII string. I intend to initially compress the data using run-length encoding to give a sequence of integers; e.g.
111110001000000000000111
becomes:
5o3z1o12z3o
(i.e. 5 ones, 3 zeros, 1 one, 12 zeros, 3 ones).
However, I wish to then compress this further into a compact ASCII string (i.e. a string using the full range of ASCII characters rather than the digits plus 'o' and 'z'). Can anyone recommend a suitable approach and/or 3rd party library to do this in Java?
If your goal is compression, just gzip the stream. It's going to do better than your run-length encoding.
Then if you need it to be text for some reason, such as to pass safely through old mail gateways, I'd also turn to a standard encoding like Base64 rather than make up your own.
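A minimal sketch of the gzip + Base64 route, using only the JDK (`java.util.zip` and `java.util.Base64`, so Java 8 or later is assumed). The class and method names are made up for illustration, and I'm assuming the data lives in a `java.util.BitSet`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.BitSet;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: the names here are illustrative, not from any library.
final class GzipBase64Sketch {

    // Gzip the BitSet's bytes, then encode the compressed bytes as Base64 text.
    static String encode(BitSet bits) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(bits.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Reverse the steps: Base64-decode, gunzip, rebuild the BitSet.
    static BitSet decode(String text) {
        byte[] compressed = Base64.getDecoder().decode(text);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return BitSet.valueOf(out.toByteArray());
    }
}
```

Note that `BitSet.toByteArray()` stops at the highest set bit, so trailing zeros are not recorded; if the total bit count matters you would have to store it separately.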
But if you want to roll your own: first I'd note that you don't need the 'o' and 'z'. You already know those values since they alternate. Assume the data starts with a run of zeros (and if it doesn't, emit an initial 0 to show that there are zero 0s).
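To make the alternating-run idea concrete, here is a rough sketch. The class name, the method names, and the explicit `length` parameter are my own assumptions (a `BitSet` does not remember trailing zeros, so the caller passes the bit count):

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper: run lengths alternate zeros, ones, zeros, ...
final class AlternatingRuns {

    // Encodes bits [0, length). The first entry always counts zeros,
    // so it is 0 when the data starts with a one.
    static List<Integer> encode(BitSet bits, int length) {
        List<Integer> runs = new ArrayList<>();
        boolean countingOnes = false; // convention: start by counting zeros
        int pos = 0;
        while (pos < length) {
            int next = countingOnes ? bits.nextClearBit(pos) : bits.nextSetBit(pos);
            if (next < 0 || next > length) {
                next = length; // the run extends to the end of the data
            }
            runs.add(next - pos);
            pos = next;
            countingOnes = !countingOnes;
        }
        return runs;
    }

    // Rebuilds the bits from the alternating run lengths.
    static BitSet decode(List<Integer> runs) {
        BitSet bits = new BitSet();
        boolean countingOnes = false;
        int pos = 0;
        for (int run : runs) {
            if (countingOnes) {
                bits.set(pos, pos + run);
            }
            pos += run;
            countingOnes = !countingOnes;
        }
        return bits;
    }
}
```

For the 24-bit example in the question, which starts with a one, this gives the runs 0, 5, 3, 1, 12, 3.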
Encoding the numbers textually is possible but probably inefficient. Look into a variable-length encoding for integer values to turn the run lengths into bytes. Then 'escape' those bytes into ASCII somehow.
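One possible reading of that, sketched under assumptions: I'm picking a 7-bits-per-byte varint (high bit set means "more bytes follow") as the variable-length encoding and Base64 as the 'escape' step. Neither choice is the only option, and the class and method names are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper: varint-encode non-negative run lengths, then Base64 them.
final class VarintRuns {

    static String encode(List<Integer> runs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int run : runs) {
            int value = run; // assumed non-negative
            // Emit 7 bits at a time, low bits first; 0x80 marks a continuation.
            while ((value & ~0x7F) != 0) {
                out.write((value & 0x7F) | 0x80);
                value >>>= 7;
            }
            out.write(value);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    static List<Integer> decode(String text) {
        List<Integer> runs = new ArrayList<>();
        int value = 0;
        int shift = 0;
        for (byte b : Base64.getDecoder().decode(text)) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) != 0) {
                shift += 7;      // more bytes belong to this number
            } else {
                runs.add(value); // last byte of this number
                value = 0;
                shift = 0;
            }
        }
        return runs;
    }
}
```

Run lengths below 128 take one byte each, so the runs 0, 5, 3, 1, 12, 3 become six bytes, or eight Base64 characters.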
But then we're back to a Base64-like encoding, and the first suggestion of gzip + Base64 is probably easier than all of this.