Char into byte? (Java)

Question

How come this happens:

char a = '\uffff'; //Highest value that char can take - 65535
byte b = (byte)a; //Casting a 16-bit value into 8-bit data type...! Isn't data lost here?
char c = (char)b; //Let's get the value back
int d = (int)c;
System.out.println(d); //65535... how?

Basically, I saw that a char is 16-bit. Therefore, if you cast it into a byte, how come no data is lost? (The value is the same after casting into an int.)

Thanks in advance for answering this little ignorant question of mine. :P

EDIT: Woah, found out that my original output actually did as expected, but I just updated the code above. Basically, a character is cast into a byte and then cast back into a char, and its original, 2-byte value is retained. How does this happen?

Answer 1

As trojanfoe states, your confusion about the results of your code is partly due to sign extension. I'll try to add a more detailed explanation that may help.

char a = '\uffff';
byte b = (byte)a;  // b = 0xFF

As you noted, this DOES result in the loss of information. This is considered a narrowing conversion. Converting a char to a byte "simply discards all but the n lowest order bits" (here n = 8, the width of a byte).
The result is: 0xFFFF -> 0xFF
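
To make the narrowing step concrete, here is a minimal sketch. The class name NarrowingDemo and the helper lowByte are made up for illustration; the only behavior relied on is the standard char-to-byte cast. It shows that two different chars with the same low 8 bits become the same byte, which is exactly the information loss described above.

```java
class NarrowingDemo {
    // Hypothetical helper: narrows a char to a byte, then returns the
    // surviving 8 bits as a value in the range 0..255.
    static int lowByte(char ch) {
        byte b = (byte) ch;   // narrowing: the high 8 bits are discarded
        return b & 0xFF;      // mask so the bits are shown unsigned
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(lowByte('\uffff'))); // ff
        System.out.println(Integer.toHexString(lowByte('\u0141'))); // 41
        System.out.println(Integer.toHexString(lowByte('A')));      // 41
    }
}
```

The last two lines print the same value even though the inputs differ, so the original char cannot be recovered from the byte in general.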

char c = (char)b;  // c = 0xFFFF

Converting a byte to a char is considered a special conversion. It actually performs TWO conversions. First, the byte is SIGN-extended (the new high order bits are copied from the old sign bit) to an int (a normal widening conversion). Second, the int is converted to a char with a narrowing conversion.
The result is: 0xFF -> 0xFFFFFFFF -> 0xFFFF
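
The two steps can be written out by hand. The following is only a sketch: ByteToCharDemo and widenThenNarrow are hypothetical names, and the helper mimics the cast with an explicit intermediate int rather than showing what the compiler literally emits.

```java
class ByteToCharDemo {
    // Hypothetical helper spelling out the two steps of a byte-to-char cast.
    static char widenThenNarrow(byte b) {
        int widened = b;         // step 1: widening to int, sign-extended
        return (char) widened;   // step 2: narrowing to char keeps the low 16 bits
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;                                    // all eight bits set, value -1
        System.out.println((int) widenThenNarrow(b));            // 65535
        System.out.println((char) b == widenThenNarrow(b));      // true
        System.out.println((int) widenThenNarrow((byte) 0x41));  // 65
    }
}
```

A byte whose sign bit is clear, such as 0x41, is padded with zeros instead, so it maps to the char with the same numeric value.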

int d = (int)c;  // d = 0x0000FFFF

Converting a char to an int is considered a widening conversion. When a char type is widened to an integral type, it is ZERO-extended (the new high order bits are set to 0).
The result is: 0xFFFF -> 0x0000FFFF. When printed, this will give you 65535.
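
For contrast, a short holds the same 16 bits but is signed, so widening it sign-extends instead. A small sketch (WideningDemo and its two helpers are illustrative names only):

```java
class WideningDemo {
    // Widening a char: the new high bits are zeros.
    static int charToInt(char c) {
        return c;
    }

    // Widening a short: the new high bits are copies of the sign bit.
    static int shortToInt(short s) {
        return s;
    }

    public static void main(String[] args) {
        char c = '\uffff';
        short s = (short) 0xFFFF;           // same 16-bit pattern as c
        System.out.println(charToInt(c));   // 65535
        System.out.println(shortToInt(s));  // -1
    }
}
```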

The three links I provided are the official Java Language Specification details on primitive type conversions. I HIGHLY recommend you take a look. They are not terribly verbose (and in this case relatively straightforward). They detail exactly what Java will do behind the scenes with type conversions. This is a common area of misunderstanding for many developers. Post a comment if you are still confused about any step.

Answer 2

It's sign extension. Try \u1234 instead of \uffff and see what happens.
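
A minimal sketch of that experiment, assuming the suggested replacement value is '\u1234' (its low byte, 0x34, has a clear sign bit). RoundTripDemo and roundTrip are hypothetical names; the body repeats the casts from the question.

```java
class RoundTripDemo {
    // Hypothetical helper: char -> byte -> char -> int, as in the question.
    static int roundTrip(char a) {
        byte b = (byte) a;   // keeps only the low 8 bits
        char c = (char) b;   // sign-extends to int, then keeps the low 16 bits
        return c;            // zero-extended to int
    }

    public static void main(String[] args) {
        System.out.println(roundTrip('\uffff')); // 65535
        System.out.println(roundTrip('\u1234')); // 52
    }
}
```

With 0xFFFF the low byte is 0xFF, so sign extension happens to rebuild the original bits. With 0x1234 the low byte is 0x34, the high byte 0x12 is gone for good, and the result is 0x0034, which is 52.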

Answer 3

Java's byte is signed. It's counterintuitive. In almost all situations where a byte is used, programmers would want an unsigned byte instead. It's extremely likely to be a bug if a byte is cast to int directly.

This does the intended conversion correctly in almost all programs:

int c = 0xff & b;  // mask off the sign-extended high bits, giving 0..255

Empirically, the choice of a signed byte was a mistake.
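
To see the difference between the direct cast and the masked one, here is a small sketch. The class and helper names are made up; Byte.toUnsignedInt is a standard library method available since Java 8 that does the same job as the mask.

```java
class UnsignedByteDemo {
    // Direct widening: the byte is sign-extended.
    static int direct(byte b) {
        return b;
    }

    // Masked widening: only the low 8 bits survive, so the result is 0..255.
    static int masked(byte b) {
        return 0xff & b;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println(direct(b));               // -1
        System.out.println(masked(b));               // 255
        System.out.println(Byte.toUnsignedInt(b));   // 255
    }
}
```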

Answer 4

Some rather strange stuff is going on on your machine. Take a look at the Java Language Specification, chapter 4.2.1:

The values of the integral types are integers in the following ranges:

For byte, from -128 to 127, inclusive

... snip others ...

If your JVM is standards compliant, then your output should be -1.
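
As a rough check of the range and of the -1 claim, the following sketch prints the byte itself and, for comparison, the value after casting back through char as in the edited question. ByteRangeDemo and narrow are illustrative names only.

```java
class ByteRangeDemo {
    // Hypothetical helper: the narrowing cast from the question.
    static byte narrow(char a) {
        return (byte) a;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE); // -128 to 127
        byte b = narrow('\uffff');
        System.out.println(b);               // -1
        System.out.println((int) (char) b);  // 65535
    }
}
```

Printing the byte gives -1, as this answer expects; the 65535 only appears once the byte has been cast back to a char first.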

Answer 1
33, accepted, 2011-02-10 17:06:21

Answer 2
8, 2011-02-10 15:03:28

Answer 3
6, 2011-02-10 15:18:20

Answer 4
0, 2011-02-10 15:10:50