I found a Java class that converts a byte or char to its hexadecimal value, but I cannot understand the code clearly. Can you explain what the code does, or where I can find more resources about this?
public class UnicodeFormatter {
    static public String byteToHex(byte b) {
        // Returns hex String representation of byte b
        char hexDigit[] = {
            '0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
        };
        char[] array = {hexDigit[(b >> 4) & 0x0f], hexDigit[b & 0x0f]};
        return new String(array);
    }

    static public String charToHex(char c) {
        // Returns hex String representation of char c
        byte hi = (byte) (c >>> 8);
        byte lo = (byte) (c & 0xff);
        return byteToHex(hi) + byteToHex(lo);
    }
} // class
First, let us start with some definitions: a char in Java occupies 2 bytes, and a byte is composed of 8 bits. A byte can therefore be represented by 2 hexadecimal digits, i.e., two groups of 4 bits. This is exactly what the byteToHex method does: it first splits the byte into two groups of 4 bits, and then maps each group to a hexadecimal symbol using the hexDigit array. Since the value of a group of 4 bits is always less than 16 (2^4), each group always has a mapping in the hexDigit array.
For example, suppose you want to convert the number 29 to hexadecimal. 29 is represented in binary as 00011101, which splits into two groups of 4 bits: 0001 and 1101. 0001 can be obtained by shifting away the least significant 4 bits (1101) from the binary representation of 29, so that 0001 ends up in the lowest 4 bits. This is accomplished in Java with (b >> 4). 1101 can be obtained with b & 0x0f, which is equivalent to 00011101 & 00001111 = 00001101 = 1101. By bitwise ANDing the binary number with 0x0f, you clear (set to 0) everything except the least significant 4 bits. You end up with 1 (0001) and 13 (1101), which map to 1 and D respectively in the hexadecimal system. 29 is therefore represented by 1D in hexadecimal (the method returns "1d", since the hexDigit array uses lowercase letters). A similar logic applies to the charToHex method. The only difference is that instead of converting a single byte, you are converting 2, since a char is 2 bytes.
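The worked example for 29 can be checked with a small sketch. The class name NibbleDemo and the variable names high and low are only illustrative; the logic mirrors byteToHex from the question. The second call uses a negative byte to show why the shifted value is also masked with 0x0f: the byte is widened to an int before shifting, so >> would otherwise leave sign bits behind.

```java
class NibbleDemo {
    private static final char[] HEX_DIGIT = "0123456789abcdef".toCharArray();

    // Same logic as byteToHex in the question, with each step named.
    static String byteToHex(byte b) {
        int high = (b >> 4) & 0x0f; // upper 4 bits; the mask drops sign-extension bits
        int low = b & 0x0f;         // lower 4 bits
        return new String(new char[] {HEX_DIGIT[high], HEX_DIGIT[low]});
    }

    public static void main(String[] args) {
        System.out.println(byteToHex((byte) 29)); // 1d
        System.out.println(byteToHex((byte) -1)); // ff
    }
}
```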
Basically, what it is doing here is the same as turning 23 into a string by splitting it into 2*10+3 and then turning 2 and 3 into characters.
To break it down, we first divide by 16, since we're working in hex.
b >> 4 means shift the bits 4 places to the right, so
12345678 >> 4 = 00001234
then the value in positions 1234 gets looked up in the hexDigit array.
Then we do a modulus operation, also known as getting the remainder. In the decimal example, this is finding the 3 by chopping off everything to the left. For binary, they are using AND here.
0x0f in bits is 00001111, so when ANDed with a byte, it will change the left 4 spaces into 0s, leaving only the right 4.
12345678 & 0x0f = 00005678
and again we look up the value in positions 5678 in the hexDigit array. Note that I'm using 1-8 as position markers, the actual data will be all 0s and 1s.
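To make the division and remainder analogy concrete, here is a minimal sketch. The class and method names are made up for illustration, and the comparison assumes the value is in the range 0 to 255 (an unsigned byte value); for negative numbers, shifting and masking do not behave like / and %.

```java
class DivModDemo {
    // Decimal analogy: 23 becomes '2' and '3' via / 10 and % 10.
    // Assumes 0 <= n <= 99.
    static String twoDecimalDigits(int n) {
        return "" + (char) ('0' + n / 10) + (char) ('0' + n % 10);
    }

    // For v in 0..255, shifting right by 4 equals dividing by 16,
    // and masking with 0x0f equals taking the remainder modulo 16.
    static boolean shiftAndMaskMatchDivMod(int v) {
        return (v >> 4) == (v / 16) && (v & 0x0f) == (v % 16);
    }

    public static void main(String[] args) {
        System.out.println(twoDecimalDigits(23)); // 23
        boolean allMatch = true;
        for (int v = 0; v <= 255; v++) {
            allMatch &= shiftAndMaskMatchDivMod(v);
        }
        System.out.println(allMatch); // true
    }
}
```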
Edit: The second function does basically the same operation: it uses the >>> and & operators to split the Unicode char into bytes. A Java char is 16 bits, so it shifts the value 8 places to the right to get the left 8 bits, and uses & 0xff to get the right 8 bits.
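The split into high and low bytes can be sketched as follows. This is not the original code: for brevity it formats each half with String.format and "%02x" instead of the hexDigit lookup table, and the class name CharSplitDemo is only illustrative.

```java
class CharSplitDemo {
    static String charToHex(char c) {
        int hi = c >>> 8;  // upper 8 bits of the 16-bit char
        int lo = c & 0xff; // lower 8 bits of the 16-bit char
        // "%02x" prints each half as two lowercase hex digits, zero-padded.
        return String.format("%02x%02x", hi, lo);
    }

    public static void main(String[] args) {
        System.out.println(charToHex('A'));      // 0041
        System.out.println(charToHex('\u20ac')); // 20ac (the euro sign)
    }
}
```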