
Why different byte array representation in Java and Javascript?

I was trying to see the UTF-8 bytes of 👍 in both Java and Javascript.

In Javascript,

new TextEncoder().encode("👍"); returns => [240, 159, 145, 141]

while in Java,

"👍".getBytes("UTF-8") returns => [-16, -97, -111, -115] "👍".getBytes("UTF-8")返回=> [-16, -97, -111, -115]

I converted those byte arrays to hex strings using methods I found for each language (JS, Java), and both returned F09F918D.

In fact, -16 & 0xFF gives => 240

I am curious to know why the two languages choose different ways of representing byte arrays. It took me a while to figure this out.

In Java, all bytes are signed, so the range of one byte is -128 to 127. In Javascript, though, the returned values are unsigned 8-bit integers (TextEncoder.encode returns a Uint8Array), so they can be represented in decimal using the full range 0 to 255. The underlying bits are identical; only the interpretation differs.
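As a minimal sketch (the class name is mine, not from the original post), the following Java program prints each byte of the UTF-8 encoding both ways: Java's signed view, and the unsigned view that Javascript shows. Masking with 0xFF (or calling Byte.toUnsignedInt) widens the byte to an int holding the unsigned value.

    import java.nio.charset.StandardCharsets;

    public class SignedBytes {
        public static void main(String[] args) {
            byte[] bytes = "👍".getBytes(StandardCharsets.UTF_8);
            for (byte b : bytes) {
                // b prints using Java's signed interpretation (-128..127);
                // b & 0xFF yields the unsigned value (0..255) that
                // Javascript's Uint8Array would display.
                System.out.println(b + " -> " + (b & 0xFF));
            }
            // prints: -16 -> 240, -97 -> 159, -111 -> 145, -115 -> 141
        }
    }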

Therefore, if you convert both results to their one-byte hexadecimal representations, they are the same: F0 9F 91 8D.
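A small Java sketch of that conversion (again, the class name is mine): the mask makes each signed byte format as exactly two unsigned hex digits, rather than a sign-extended int.

    import java.nio.charset.StandardCharsets;

    public class Utf8Hex {
        public static void main(String[] args) {
            StringBuilder hex = new StringBuilder();
            for (byte b : "👍".getBytes(StandardCharsets.UTF_8)) {
                // b & 0xFF widens to an int in 0..255, so %02X prints
                // two hex digits instead of fffffff0 for negative bytes.
                hex.append(String.format("%02X", b & 0xFF));
            }
            System.out.println(hex); // F09F918D
        }
    }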

As for why Java decided to omit unsigned types, that is a separate discussion.
