
Why is Bytecode not human-readable?

Original post: 2020-07-04 17:59:30 · Tags: java, jvm

I'm confused about a certain topic:

When you compile Java or Python, you get bytecode which will run on the respective VMs. In a previous question I had asked why, when you open a .pyc or .class file in a text editor, it appears as gibberish and not like readable bytecode (LOAD, STORE operations etc).

The answer I got at the time was built around the argument "That's like opening an .exe file and expecting to see x86 assembly". The analogy was that the bytecode I had seen (LOAD, STORE, etc.) is the "assembly" rendering of the real bytecode, which is not readable.

That would make sense if not for one thing: you can't compare an exe file to a bytecode file. An exe file is ALREADY compiled to machine code. A bytecode file is NOT. A bytecode file is fed to a VM, which then interprets it (usually with JIT).

That means that whoever wrote the JVM, for instance (which is itself just a piece of software), would need to write a bytecode interpreter. And I really doubt they wrote an interpreter to handle the following:

Java .class file:

I could be wrong and maybe they DID write an interpreter to handle this form of bytecode for some odd reason, but it doesn't seem likely. However, if the JVM handles the "assembly" version of the bytecode, then that would mean the cycle is:

.java -> .class (unreadable) -> .class (readable right as it enters the JVM)

There's an almost meaningless step in between.

I'm just really confused at this point.

2 Answers

They did write an interpreter for this form of bytecode. They read it as bytes, of course, not as ASCII characters, which makes it more usable. For example, each instruction code takes only one byte, rather than, e.g., the five it would take to spell out "store".
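To make the point concrete, here is a minimal sketch of an interpreter that consumes raw bytes. Python is used only for brevity. The opcode values and the names PUSH, ADD, MUL and HALT are made up for illustration and are not real JVM or CPython opcodes.

```python
# Hypothetical one-byte opcodes for a toy stack machine (not real JVM opcodes).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(code: bytes) -> int:
    """Interpret a toy bytecode program and return the top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to decode
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            # PUSH is followed by a one-byte operand.
            stack.append(code[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {pc - 1}")


# (2 + 3) * 4, encoded in 9 bytes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)
```

Written to a file and opened in a text editor, `program` would show up as control characters, which is the "gibberish" from the question. A listing such as "PUSH 2, PUSH 3, ADD" is the same program spelled out for humans, and the interpreter never needs it.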

The goal was to have something compact in memory, but not actually compiled to machine code that would be specific to only one kind of device. Java bytecode is more or less its own form of machine code.

If you would like to read it, however, use the javap command (for example, javap -c) to disassemble it into a more readable form.
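javap applies to Java class files. Since the question also mentions Python, the same idea can be sketched with CPython's standard dis module. The function add below is a made-up example, and the exact opcode names and byte values vary between Python versions, so no output is shown here.

```python
import dis
import io


def add(a, b):
    return a + b


# The raw bytecode: a bytes object holding numeric opcodes and their arguments.
raw = add.__code__.co_code

# The readable listing: dis maps each opcode number back to its mnemonic.
buffer = io.StringIO()
dis.dis(add, file=buffer)
listing = buffer.getvalue()

print(raw.hex())
print(listing)
```

The first print shows what is actually stored and executed. The second shows the "assembly" rendering that is produced on demand for a human reader.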

Bytecode is the "machine code" for a virtual machine. As such, it has much the same goals and restrictions as "real" machine code: compactness, efficient decoding, etc.

The fact that bytecode is executed by a virtual machine rather than by a "real" machine is not particularly relevant.
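As a small illustration of "compact, efficient decoding": a Java .class file begins with a fixed binary header consisting of a 4-byte magic number (0xCAFEBABE) followed by 2-byte minor and major version fields, all big-endian. The sketch below decodes such a header in Python. The eight bytes are hand-written sample data (major version 52 corresponds to Java 8), not read from a real file.

```python
import struct

# Hand-written sample: the first 8 bytes of a class file header.
# magic = 0xCAFEBABE, minor_version = 0, major_version = 52 (Java 8).
header = bytes([0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x00, 0x00, 0x34])

# Big-endian: 4-byte unsigned magic, then two 2-byte unsigned version fields.
magic, minor, major = struct.unpack(">IHH", header)

print(hex(magic), minor, major)
```

Decoding is a single fixed-layout read with no text parsing, which is the kind of property a VM wants from its input format.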