
JAVA and byte arrays

I'm trying to use an API that communicates over a socket. A request is made up of different parts, and one of them is the header, which is specified as follows:

Fixed header: 2 bytes, fixed at 0xffff

Generally I'm not good with bytes and streams, since I've never used them. So how should I create said byte array? I've tried the following:

byte[] header = new byte[]{(byte)0xff, (byte)0xff};

But the bytes each become -1, which I believe is because 0xFF translates to 255, which is outside the signed byte range (-128 to +127). So how do I create a header like that?

You just did it.

In the end, computers just know about bits. The rest is what the code, and the humans looking at it, make of it. A bit is a 0 or a 1. If you bought a computer with 4GB RAM, then your computer can remember 34359738368 of those.

That's a bit unwieldy, so AMD, or Intel, or TSMC, or whoever baked your chip, baked into the chip's design that it groups the bits in sets of 8 (and for certain jobs, in sets of 64 or even more). But that's where it ends. It's just bits, really. Negative number? What's that? 2? What is this 2 you speak of? I know only 0 and 1.

That's unwieldy too, so we humans don't want to say 'this byte holds the value 00000101'. We'll just say 'that holds 5'.

bits     = decimal
00000000 = 0
00000001 = 1
00000010 = 2
00000011 = 3
00000100 = 4
00000101 = 5
... and so on

That's great, but what about -1? We just have 0 and 1. There's no -, so how do we do this?

That's where it gets interesting. It's a convention, not something in the computer. There's this thing called two's complement: we all agree to check the first bit. If it is a 1, then we shall call the number -X, where X is found by applying the following algorithm: flip every bit (all zeroes become ones, all ones become zeroes), and add 1 to the result.

11111011 = -5.

Why? Well, flip every bit: 00000100
then add 1 to it         : 00000101

which is 5.
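The flip-and-add-one rule is easy to check in Java itself. This is only a small sketch; the class name `TwosComplementDemo` and the helpers `bits` and `flipAndAddOne` are made up for illustration:

```java
class TwosComplementDemo {
    // Renders the low 8 bits of a value as a zero-padded binary string.
    static String bits(int value) {
        String s = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", s).replace(' ', '0');
    }

    // Applies "flip every bit, then add 1" and keeps only the low 8 bits.
    static int flipAndAddOne(int value) {
        return (~value + 1) & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0b11111011;                  // first bit is 1, so Java reads it as negative
        System.out.println(b);                       // -5
        System.out.println(bits(b));                 // 11111011
        System.out.println(bits(flipAndAddOne(b)));  // 00000101, which is 5
    }
}
```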

But that immediately eats half of what we can represent. After all, the biggest number we can now store in a byte is 01111111, which is 127. If we add 1 to this number, we get 10000000, but hey, that starts with a 1 bit, so assuming we are all in agreement that this means it is negative, 10000000 is -128 (a bit of an exotic case).

And sometimes that's annoying or not worth it. So sometimes we all agree that the number cannot be negative at all, and 10000000 is just 128, and 11111111 is just 255.

The computer has no idea. 255 is 11111111 and so is -1. So what's 11111111? The computer doesn't know. It doesn't even know what 2 is. It just knows zeroes and ones, and as far as the computer is concerned, 11111111 is what it is. (The math works out so that + and - 'just work' regardless of whether we decree that these numbers are two's complement signed or not. Cool, huh? Try it: if 11111011 is both -5 and 251 depending on the opinion of whoever reads off the number, what happens? -5 + 2 is -3. 251 + 2 is 253. -3 and 253 boil down to the same sequence of bits, 11111101. Just an example. This is, incidentally, why we do the weird 'flip all bits and add 1' stuff: so that + and - just work and you don't need to pass along whether you consider the bits 'signed' or 'unsigned'.)
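The -5 versus 251 example can be tried out directly. A minimal sketch, where `SameBitsDemo` and `low8` are illustrative names, not anything from an API:

```java
class SameBitsDemo {
    // Keeps only the low 8 bits, i.e. what a byte would actually store.
    static int low8(int value) {
        return value & 0xFF;
    }

    public static void main(String[] args) {
        int signedSum = -5 + 2;     // -3 when the bits are read as signed
        int unsignedSum = 251 + 2;  // 253 when the bits are read as unsigned
        System.out.println(Integer.toBinaryString(low8(signedSum)));   // 11111101
        System.out.println(Integer.toBinaryString(low8(unsignedSum))); // 11111101
        System.out.println((byte) signedSum == (byte) unsignedSum);    // true
    }
}
```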

In Java, all numeric types except char are signed (and char is a numeric type; you'd think it represents a character, but it really doesn't). byte is a 'signed 8-bit number', so it can represent from -128 to +127, inclusive. char is the only exception: it is an 'unsigned 16-bit number', so it can hold from 0 to 65535, inclusive. It's just that if you, for example, call System.out.println((char) 65); the println method will interpret that number as 'look this up in the Unicode table and print whatever you find there', so that prints 'A'. That's part of the source code of that particular println method; it's nothing inherent to the char type in Java, which is just 'a number between 0 and 65535'.
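A short sketch of char behaving as an unsigned 16-bit number. The class name `CharIsANumberDemo` and the helper `widen` exist only for this illustration:

```java
class CharIsANumberDemo {
    // Widening a char to int never produces a negative value, because char is unsigned.
    static int widen(char c) {
        return c;
    }

    public static void main(String[] args) {
        char c = 65;
        System.out.println(c);           // A  (println(char) prints the character)
        System.out.println(widen(c));    // 65 (println(int) prints the number)
        System.out.println(c + 1);       // 66 (char is promoted to int in arithmetic)

        char allOnes = (char) -1;        // all 16 bits set
        System.out.println(widen(allOnes)); // 65535, not -1

        short s = (short) -1;            // the same 16 bits in a signed type
        System.out.println(s);           // -1
    }
}
```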

So, when you print your byte array containing 0xFF, 0xFF in Java, because Java agreed that we consider bytes signed, it prints -1, -1. But that's just Java-ese for 0xFF, 0xFF. Your byte array contains 0xFF, 0xFF because at the bit level -1 and 255 are the exact same number. For bytes anyway. Not so for all the other types (char, short, int, long).
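If you want to see the header the way the API documentation writes it, you can print the bytes as unsigned values. A small sketch; `HeaderBytesDemo` and `toHex` are hypothetical names, while `Byte.toUnsignedInt` and the `& 0xFF` mask are standard Java:

```java
import java.util.Arrays;

class HeaderBytesDemo {
    // Formats each byte as two uppercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF)); // mask so the value is 0..255
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] header = new byte[]{(byte) 0xff, (byte) 0xff};
        System.out.println(Arrays.toString(header));        // [-1, -1]
        System.out.println(toHex(header));                  // FF FF
        System.out.println(Byte.toUnsignedInt(header[0]));  // 255
    }
}
```

The array itself is unchanged in all three printouts; only the interpretation differs.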

To recap:

byte x = (byte) 200;
byte x = (byte) 0xC8;
byte x = -56;

In all these cases, x ends up holding the bits 11001000. There is no way to tell the difference. You can't ask the system: so, uh, is this x equal to 200, or 0xC8, or -56? What was used to set it? The computer does not know, because the compiler translates all of the above code to the exact same end result, which is 11001000.
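The three assignments can be compared in one runnable sketch (the variables are renamed a, b and c so they can live in the same scope; `SameByteDemo` and `bits` are illustrative names):

```java
class SameByteDemo {
    // Renders a byte as its 8 bits, zero-padded on the left.
    static String bits(byte value) {
        String s = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", s).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte a = (byte) 200;
        byte b = (byte) 0xC8;
        byte c = -56;
        System.out.println(a == b && b == c); // true
        System.out.println(bits(a));          // 11001000
        System.out.println(a);                // -56, because println reads the byte as signed
    }
}
```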

255 is -1.

Well, to start, you must know that in Java all integer types except char are signed. This means that the most significant bit is reserved to represent the sign. That is why in Java the constant Byte.MAX_VALUE says a byte can only go up to 127.

Now, this means you can store 8 bits in a byte, but if you happen to turn on the sign bit, whatever you store will be presented by Java as a negative number.

Since 0xff turns on all the bits of the byte (i.e. 11111111), instead of getting 255 as you were expecting, you get -1, because that bit pattern represents -1 in a Java byte.

Perhaps to understand it, I can show you how the bits work in Java. Imagine a type called nimble with only 4 bits, where the most significant bit is reserved for the sign.

This is how it would look in Java if it existed:

Imaginary Signed Type: Nimble (4 bits)

Dec. Bin.  Hex.
--------------------
+0   0000  0x0   
+1   0001  0x1
+2   0010  0x2
+3   0011  0x3
+4   0100  0x4
+5   0101  0x5
+6   0110  0x6
+7   0111  0x7
-8   1000  0x8
-7   1001  0x9
-6   1010  0xA
-5   1011  0xB
-4   1100  0xC
-3   1101  0xD
-2   1110  0xE
-1   1111  0xF

Notice how the values whose most significant bit is on become negative numbers. If this nimble were an unsigned type, it wouldn't have negative numbers and it could reach 15.

That's why Java bytes go from -128 to 127, instead of up to 255 as you were expecting.
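The imaginary 4-bit type can be simulated with an int to regenerate the table. This is a sketch only: `NimbleDemo`, `asSigned` and `asUnsigned` are invented names, and the shift trick is just one possible way to sign-extend 4 bits:

```java
class NimbleDemo {
    // Interprets the low 4 bits as a two's complement signed value (-8..7).
    static int asSigned(int bits) {
        return (bits << 28) >> 28; // the arithmetic right shift copies the sign bit back down
    }

    // Interprets the low 4 bits as an unsigned value (0..15).
    static int asUnsigned(int bits) {
        return bits & 0xF;
    }

    public static void main(String[] args) {
        for (int bits = 0; bits <= 0xF; bits++) {
            String bin = String.format("%4s", Integer.toBinaryString(bits)).replace(' ', '0');
            System.out.printf("%s  signed=%d  unsigned=%d%n", bin, asSigned(bits), asUnsigned(bits));
        }
        // The real byte type follows the same pattern with 8 bits.
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE); // -128 to 127
    }
}
```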

Now, when it comes to creating byte arrays to send to a stream, instead of creating the byte array yourself, you could wrap your socket's output stream in a type-aware stream like DataOutputStream, which lets you send data of a specific type.

For example:

// Note: closing dOut at the end of the try block also closes the socket's output stream.
try (DataOutputStream dOut = new DataOutputStream(socket.getOutputStream())) {
   dOut.writeByte((byte)0xff);
   dOut.writeByte((byte)0xff);
}

That way you may avoid the difficulties of having to create a header array yourself.
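To convince yourself of what actually goes over the wire without opening a socket, you can point a DataOutputStream at a ByteArrayOutputStream and inspect the result. A minimal sketch; `HeaderWriteDemo`, `writeHeader` and `headerBytes` are hypothetical names, and using `writeShort` for the 2-byte header is an assumption about what fits this API rather than something the question states:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class HeaderWriteDemo {
    // Writes the fixed 2-byte header 0xffff to the given stream.
    static void writeHeader(DataOutputStream out) throws IOException {
        out.writeShort(0xFFFF); // writes the two low-order bytes, high byte first
    }

    // Captures the header in memory so the raw bytes can be inspected.
    static byte[] headerBytes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            writeHeader(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : headerBytes()) {
            System.out.printf("%02X ", b & 0xFF); // FF FF
        }
        System.out.println();
    }
}
```

With a real connection, the same `writeHeader` call would take a DataOutputStream wrapped around `socket.getOutputStream()`.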

But bottom line: your array is fine.
