
Conflicting answers on converting int to byte array of length 2 in java

I'm trying to convert integers into 2 bytes, but I'm seeing some conflicting answers online between:

a[0] = (byte)(theInt & 0xFF);
a[1] = (byte)((theInt >>> 8) & 0xFF);

and

a[0] = (byte)((theInt >>> 8) & 0xFF);
a[1] = (byte)(theInt & 0xFF);

The first seems to be the more common answer (How do I split an integer into 2 byte binary?).

However, for me personally, the second seems to be working better. If I set theInt = 10000, I get the desired {27, 10} (in hex). But with the first method I get the reverse, {10, 27}.
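For reference, a minimal, self-contained check of both snippets with theInt = 10000 (0x2710), printing the bytes in hex, would look something like this (the class name is just for illustration):

public class ByteOrderCheck {
    public static void main(String[] args) {
        int theInt = 10000; // 0x2710 in hex

        // First snippet: least significant byte first
        byte[] first = { (byte) (theInt & 0xFF), (byte) ((theInt >>> 8) & 0xFF) };

        // Second snippet: most significant byte first
        byte[] second = { (byte) ((theInt >>> 8) & 0xFF), (byte) (theInt & 0xFF) };

        System.out.printf("first:  {%02X, %02X}%n", first[0], first[1]);   // {10, 27}
        System.out.printf("second: {%02X, %02X}%n", second[0], second[1]); // {27, 10}
    }
}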

So is there any risk in me going against the popular answer and using the second method? Am I missing something? Thanks

The concept of how to sequence the bytes of a multibyte numeric value (such as the 16-bit value you're converting here) is called 'endianness'.

Little Endian is the term for 'send the least significant data first'; that'd be the first snippet (send '10' first, then '27').

Big Endian is the term for 'send the most significant data first'; that's the second snippet.

You'd think Big Endian is the sensible one (it matches how we humans write numbers, and it matches how we think about bits within a byte too: 128 is, in bits, '10000000', with the most significant data written first), that Little Endian is insane, and wonder why the concept of LE even exists.

The primary reason is that the Intel CPU architecture is Little Endian, and that is a very popular CPU architecture. If you write a 32-bit int with value '1' to some memory address and then read it back out byte for byte, you get, in order: '1', '0', '0', and '0': Little Endian - the least significant byte is stored first. These days, with pipelines, micro-architecture and who knows what else, asking an Intel processor to write it out in BE form is probably not really slower, but it is more machine code, and certainly in the past it was significantly slower.

Thus, if you were trying to squeeze maximum performance out of 2 machines talking to each other over a really fast pipe, and both machines had Intel chips, it'd go faster to send in little endian: that way both CPUs are just copying bytes, vs. sending in BE, which would require the sender chip to swap bytes 1/4 and 2/3 around for every int it sends, and the receiver chip to apply the same conversion, wasting cycles.
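As a quick illustration of that 'read it back byte for byte' idea, here is a sketch that asks Java for the platform's native byte order and writes an int in that order (on a typical Intel or ARM machine this reports LITTLE_ENDIAN; the class name is just for the example):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class NativeOrderDemo {
    public static void main(String[] args) {
        // The byte order the underlying CPU uses in memory.
        System.out.println("native order: " + ByteOrder.nativeOrder()); // LITTLE_ENDIAN on x86 and most ARM setups

        // Writing the int 1 in native order mirrors the memory thought experiment above.
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.nativeOrder());
        buf.putInt(1);
        for (byte b : buf.array()) {
            System.out.print(b + " "); // on a little-endian machine: 1 0 0 0
        }
        System.out.println();
    }
}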

Thus, from time to time, you find a protocol defined to be LE. That's... short-sighted in this world of varied hardware, where you're just as likely to end up with both sender and receiver being, say, an ARM chip, or worse, with 10 chips involved (a bunch of packet-inspecting routers in between), probably all of them BE. But now you know why LE as a concept exists.

Because of this modern age of varied hardware, almost all network protocols are defined to be Big Endian (it's called 'network byte order' for a reason). Java is generally Big Endian as well: most APIs, such as IntBuffer and co, let you pick which endianness you want, but where that choice is not available, or where defaults are concerned, it's big endian. Encodings with a byte order, such as UTF-16, also default to big-endian when no byte-order mark is present. When in doubt, Big Endian is far more likely to be the intended ordering than LE.
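A minimal sketch of those defaults: ByteBuffer is big-endian unless you explicitly switch it, so the standard library's output matches the second snippet:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class JavaDefaultOrder {
    public static void main(String[] args) {
        int theInt = 10000;

        // Default order is BIG_ENDIAN: same bytes as the second snippet, {0x27, 0x10}.
        byte[] be = ByteBuffer.allocate(2).putShort((short) theInt).array();

        // Opting in to little-endian reproduces the first snippet, {0x10, 0x27}.
        byte[] le = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN).putShort((short) theInt).array();

        System.out.printf("big endian:    %02X %02X%n", be[0], be[1]);
        System.out.printf("little endian: %02X %02X%n", le[0], le[1]);
    }
}

DataOutputStream behaves the same way: writeShort and writeInt always emit the most significant byte first.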

The ARM chips that run android devices are technically bi-endian, but in practice Android runs them little-endian; even there, Java's defaults and network byte order are still Big Endian.

Thus: Just use Big Endian (second snippet).
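If you also need to turn the two bytes back into an int on the receiving end, the matching big-endian decode would look something like this (the helper names are just for the example; the & 0xFF masks are needed so the sign bit of a byte doesn't leak into the int):

public class BigEndianRoundTrip {
    // Encode: most significant byte first (the second snippet).
    static byte[] toBytes(int theInt) {
        return new byte[] { (byte) ((theInt >>> 8) & 0xFF), (byte) (theInt & 0xFF) };
    }

    // Decode: & 0xFF strips the sign extension that happens when a byte is widened to int.
    static int fromBytes(byte[] a) {
        return ((a[0] & 0xFF) << 8) | (a[1] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(fromBytes(toBytes(10000))); // prints 10000
    }
}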

That just leaves one mystery: why is the accepted answer to your linked question the 'weird' one (Little Endian), and why does it get that many upvotes even though it doesn't highlight this? The question even specifically asks for Big Endian (it describes it without using the term of art, but it describes BE nevertheless).

I don't know. It's a stupid answer with a checkbox and 68 votes. Mysterious.

I did my part, and downvoted it.
