
How does encoding/decoding bytes work in Java?

A little background: I'm doing the cryptopals challenges and I finished https://cryptopals.com/sets/1/challenges/1, but realized I didn't learn what I guess is meant to be learned (or coded).

I'm using the Apache Commons Codec library for Hex and Base64 encoding/decoding. The goal is to decode the hex string and re-encode it to Base64. The "hint" at the bottom of the page says "Always operate on raw bytes, never on encoded strings. Only use hex and base64 for pretty-printing."

Here's my answer...

import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.binary.Hex;

// Class name is a placeholder; the post does not show the class declaration.
public class Challenge1 {

    private static final Hex forHex = new Hex();
    private static final Base64 forBase64 = new Base64();

    // Decode a hex string into the raw bytes it represents.
    public static byte[] hexDecode(String hex) throws DecoderException {
        return forHex.decode(hex.getBytes());
    }

    // Encode raw bytes as Base64; the result holds the Base64 text as bytes.
    public static byte[] encodeB64(byte[] bytes) {
        return forBase64.encode(bytes);
    }

    public static void main(String[] args) throws DecoderException {
        String hex = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";

        // decode hex String to byte[]
        byte[] myHexDecoded = hexDecode(hex);
        String myHexDecodedString = new String(myHexDecoded);

        // Lyrics from Vanilla Ice's "Ice Ice Baby"
        System.out.println(myHexDecodedString);

        // encode myHexDecoded to a Base64-encoded byte[]
        byte[] myHexEncoded = encodeB64(myHexDecoded);
        String myB64String = new String(myHexEncoded);

        // "pretty printing" of base64
        System.out.println(myB64String);
    }
}

...but I feel like I cheated. I didn't learn how to decode bytes that were encoded as hex, and I didn't learn how to encode "pure" bytes to Base64; I just learned how to use a library to do something for me.

If I were to take a String in Java and get its bytes, how would I encode those bytes into hex? For example, the following code snippet turns "Hello" (which is readable English) into the byte value of each character:

String s = "Hello";
char[] sChar = s.toCharArray();
byte[] sByte = new byte[sChar.length];
for(int i = 0; i < sChar.length; i++) {
    sByte[i] = (byte) sChar[i];
    System.out.println("sByte[" + i + "] = " +sByte[i]);
}

which yields sByte[0] = 72, sByte[1] = 101, sByte[2] = 108, sByte[3] = 108, sByte[4] = 111.

Let's use 'o' as an example. I am guessing its decimal value is 111. Do I just take that decimal value and change it to its hex version?

If so, to decode, do I just take the characters in the hex String two at a time, convert them to decimal values, then convert to ASCII? Will it always be ASCII?

"To decode, do I just take the characters in the hex String two at a time, convert them to decimal values, then convert to ASCII? Will it always be ASCII?"

No. You take the characters two at a time and map each one to its numeric value: the character '0' to 0, the character '1' to 1, ..., the character 'a' (or 'A', depending on which letter case you want to support) to 10, ..., the character 'f' or 'F' to 15.

Then you multiply the first numeric value by 16 and add the second numeric value to it, which gives the unsigned integer value of your byte (0 to 255). Finally, you convert that unsigned integer value to a signed byte, because Java's byte type is signed (-128 to 127).
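The steps above can be sketched in plain Java without any library. This is only an illustration of the algorithm as described, not how commons-codec implements it; the class name `ManualHex` and the helper `hexDigitValue` are made up for this example.

```java
final class ManualHex {

    // Maps one hex character to its numeric value 0..15 (accepts both cases).
    static int hexDigitValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw new IllegalArgumentException("Not a hex digit: " + c);
    }

    // Decodes a hex string into raw bytes, two characters per byte.
    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int high = hexDigitValue(hex.charAt(2 * i));
            int low = hexDigitValue(hex.charAt(2 * i + 1));
            int unsigned = high * 16 + low; // unsigned value, 0..255
            out[i] = (byte) unsigned;       // narrowing cast: 128..255 become -128..-1
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = decode("6fff");
        System.out.println(bytes[0]); // 111  (6 * 16 + 15)
        System.out.println(bytes[1]); // -1   (255 as a signed byte)
    }
}
```

Note that the cast to `byte` keeps the same eight bits; only the interpretation of the top bit changes.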

ASCII has nothing to do with this algorithm.
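One way to see this: hex decoding stops at raw bytes, and a character encoding only comes into play if you later choose to turn those bytes into a String. The small sketch below (the class name `CharsetDemo` is made up for this example) takes the two bytes that the hex string "c3a9" stands for and interprets them under two different charsets.

```java
import java.nio.charset.StandardCharsets;

final class CharsetDemo {

    // The two raw bytes written in hex as "c3a9".
    static final byte[] RAW = { (byte) 0xC3, (byte) 0xA9 };

    static String asUtf8() {
        return new String(RAW, StandardCharsets.UTF_8);
    }

    static String asLatin1() {
        return new String(RAW, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Same bytes, different text depending on the charset you pick.
        System.out.println(asUtf8().length());   // 1 (the single character U+00E9)
        System.out.println(asLatin1().length()); // 2 (U+00C3 followed by U+00A9)
    }
}
```

The bytes in the cryptopals input happen to be readable as ASCII text, but nothing in the hex format requires that.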

To see how it's done in practice, you can just look at the commons-codec implementation, since it is open source.
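For the encoding direction that the question also asks about, a hand-written version could look roughly like the following. It is a minimal sketch and not the commons-codec code: the class name `ManualEncoders` and its method names are made up here, and the Base64 part assumes the standard alphabet with '=' padding.

```java
import java.nio.charset.StandardCharsets;

final class ManualEncoders {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();
    private static final char[] B64_ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    // Each byte becomes two hex digits: the high 4 bits, then the low 4 bits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            int unsigned = b & 0xFF; // undo Java's sign extension
            sb.append(HEX_DIGITS[unsigned >>> 4]);
            sb.append(HEX_DIGITS[unsigned & 0x0F]);
        }
        return sb.toString();
    }

    // Each group of 3 bytes (24 bits) becomes 4 characters of 6 bits each.
    static String toBase64(byte[] bytes) {
        StringBuilder sb = new StringBuilder((bytes.length + 2) / 3 * 4);
        for (int i = 0; i < bytes.length; i += 3) {
            int remaining = Math.min(3, bytes.length - i);
            int b0 = bytes[i] & 0xFF;
            int b1 = remaining > 1 ? bytes[i + 1] & 0xFF : 0;
            int b2 = remaining > 2 ? bytes[i + 2] & 0xFF : 0;
            int group = (b0 << 16) | (b1 << 8) | b2;
            sb.append(B64_ALPHABET[(group >>> 18) & 0x3F]);
            sb.append(B64_ALPHABET[(group >>> 12) & 0x3F]);
            // A short final group is padded with '=' instead of real digits.
            sb.append(remaining > 1 ? B64_ALPHABET[(group >>> 6) & 0x3F] : '=');
            sb.append(remaining > 2 ? B64_ALPHABET[group & 0x3F] : '=');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] hello = "Hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(hello));    // 48656c6c6f
        System.out.println(toBase64(hello)); // SGVsbG8=
    }
}
```

So for 'o', the byte value 111 is written as the two hex digits 6 and f, because 111 = 6 * 16 + 15.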


 