Bit Twiddling: Encoding Unsigned Primitives

Question

Below is a class that encodes unsigned primitive types as a byte array and returns an encoded byte array as a decimal string. I understand conceptually how encodeIntBigEndian & byteArrayToDecimalString work. However, I'd appreciate clarity on:

Why/how does shifting val by ((size - i - 1) * Byte.SIZE) produce an unsigned Java byte value?

Also, why does applying a byte mask of 0xff convert the byte to a decimal string value?

public class BruteForceCoding {
 private static byte byteVal = 101; // one hundred and one
 private static short shortVal = 10001; // ten thousand and one
 private static int intVal = 100000001; // one hundred million and one
 private static long longVal = 1000000000001L;// one trillion and one

 private final static int BSIZE = Byte.SIZE / Byte.SIZE;
 private final static int SSIZE = Short.SIZE / Byte.SIZE;
 private final static int ISIZE = Integer.SIZE / Byte.SIZE;
 private final static int LSIZE = Long.SIZE / Byte.SIZE;

 private final static int BYTEMASK = 0xFF; // 8 bits
 public static String byteArrayToDecimalString(byte[] bArray) {
  StringBuilder rtn = new StringBuilder();
  for (byte b : bArray) {
   rtn.append(b & BYTEMASK).append(" ");
  }
  return rtn.toString();
 }

 public static int encodeIntBigEndian(byte[] dst, long val, int offset, int size) {
  for (int i = 0; i < size; i++) {
   dst[offset++] = (byte) (val >> ((size - i - 1) * Byte.SIZE));
  }
  return offset;
 }

 public static void main(String[] args) {
  byte[] message = new byte[BSIZE + SSIZE + ISIZE + LSIZE];
  // Encode the fields in the target byte array
  int offset = encodeIntBigEndian(message, byteVal, 0, BSIZE);
  offset = encodeIntBigEndian(message, shortVal, offset, SSIZE);
  offset = encodeIntBigEndian(message, intVal, offset, ISIZE);
  encodeIntBigEndian(message, longVal, offset, LSIZE);
  System.out.println("Encoded message: " + byteArrayToDecimalString(message));
 }
}

Answer 1

1) It doesn't, by itself. The shift moves the value right by a whole number of bytes, so the byte of interest ends up in the low 8 bits. The cast to (byte) then discards all the higher bits. Together the two amount to a shift-and-mask operation that extracts the individual bytes of the value, most significant first. Each extracted byte is still a signed Java byte.

2) It doesn't. In b & 0xff, the byte is first promoted to an int (with sign extension), and the mask then clears everything except the low eight bits. The result is an int between 0 and 255, instead of the signed byte value between -128 and 127. The conversion to a string is a separate step: .append() renders the int it receives as a decimal number, and that happens implicitly in the call.
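To make the difference concrete, here is a minimal sketch comparing what append() receives with and without the mask. The class name MaskDemo and the helper toUnsigned are made up for this illustration and are not part of the question's code.

```java
class MaskDemo {
    // Hypothetical helper: reinterpret the 8 bits of b as a value in 0..255.
    static int toUnsigned(byte b) {
        // b is promoted to int with sign extension, then the mask keeps
        // only the low 8 bits, so the result is never negative.
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 200; // narrowing cast keeps the bit pattern 0xC8
        StringBuilder sb = new StringBuilder();
        sb.append(b);               // the byte widens to int -56
        sb.append(" ");
        sb.append(toUnsigned(b));   // the masked int 200
        System.out.println(sb);     // prints "-56 200"
    }
}
```

Both calls resolve to StringBuilder.append(int). The mask only changes which int is passed in, not how it is rendered.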

Answer 2

Why/how does shifting val by ((size - i - 1) * Byte.SIZE) produce an unsigned Java byte value?

It doesn't. >> is a sign-extending shift, so it preserves the sign of its left operand. >>> by a non-zero number of bits is guaranteed to produce a non-negative result, so its output could be considered unsigned.

Either way, as soon as it's cast to a byte , the value is signed again since Java has no unsigned byte type.
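A small sketch of this point, with a sample value chosen only for illustration. The class name ShiftDemo and its two helpers are hypothetical; they are not from the question.

```java
class ShiftDemo {
    // Hypothetical helpers: extract a byte using each kind of shift.
    static byte viaArithmeticShift(long val, int bits) {
        return (byte) (val >> bits);   // sign-extending shift, then narrow
    }

    static byte viaLogicalShift(long val, int bits) {
        return (byte) (val >>> bits);  // zero-filling shift, then narrow
    }

    public static void main(String[] args) {
        long val = -256L; // bit pattern 0xFFFFFFFFFFFFFF00

        System.out.println(val >> 8);   // -1: the sign bit is copied in
        System.out.println(val >>> 8);  // 72057594037927935: zeros are shifted in

        // After the cast only the low 8 bits survive, and those are the
        // same for both shifts, so both lines print -1.
        System.out.println(viaArithmeticShift(val, 8));
        System.out.println(viaLogicalShift(val, 8));
    }
}
```

For shift distances of at most 56 bits on a long, the two operators differ only in bits above the low eight, which is why the choice does not affect the encoded bytes in the question's loop.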

Also, why does applying a byte mask of 0xff convert the byte to a decimal string value?

It doesn't.

Decimal conversion happens at

 rtn.append(b & BYTEMASK).append(" ")

(b & BYTEMASK) has type int because b is promoted to int before the & is applied, and its value lies in the range [0, 256). StringBuilder.append(int) is documented as appending the decimal representation of its argument.

UPDATE:

To understand

 for (int i = 0; i < size; i++) { dst[offset++] = (byte) (val >> ((size - i - 1) * Byte.SIZE)); }

consider what it does for a size of 4, which corresponds to a 4-byte/32-bit Java int.

dst[offset  ] = (byte) (val >> 24);  // (byte) 0x??????01 == 0x01
dst[offset+1] = (byte) (val >> 16);  // (byte) 0x????0123 == 0x23
dst[offset+2] = (byte) (val >>  8);  // (byte) 0x??012345 == 0x45
dst[offset+3] = (byte) (val >>  0);  // (byte) 0x01234567 == 0x67

so given int 0x01234567 it will put into dst the bytes 0x01 0x23 0x45 0x67 in order.
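The walk-through can be checked with a short round trip. This is a sketch only: encode mirrors the loop from the question, while decode and the class name BigEndianDemo are hypothetical additions that rebuild the value using the 0xff mask.

```java
class BigEndianDemo {
    // Same loop as encodeIntBigEndian in the question.
    static int encode(byte[] dst, long val, int offset, int size) {
        for (int i = 0; i < size; i++) {
            dst[offset++] = (byte) (val >> ((size - i - 1) * Byte.SIZE));
        }
        return offset;
    }

    // Hypothetical inverse: reads size bytes as one unsigned big-endian
    // number (intended for size < 8, so the result fits in a positive long).
    static long decode(byte[] src, int offset, int size) {
        long result = 0;
        for (int i = 0; i < size; i++) {
            // Without the mask, a negative byte would sign-extend and
            // overwrite the bits already accumulated in result.
            result = (result << Byte.SIZE) | (src[offset + i] & 0xFF);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        encode(buf, 0x01234567, 0, 4);
        System.out.println(String.format("%02x %02x %02x %02x",
                buf[0], buf[1], buf[2], buf[3]));   // prints "01 23 45 67"
        System.out.println(decode(buf, 0, 4) == 0x01234567L); // prints "true"

        // A negative int comes back as its unsigned 32-bit interpretation.
        encode(buf, -2, 0, 4);
        System.out.println(decode(buf, 0, 4));      // prints "4294967294"
    }
}
```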
