
How is a variable actually stored in memory for Java?

2020-05-19 22:00:17 | Tags: java, file, memory, io, binary

I'm learning about text I/O and binary I/O in Java right now. I read that each value that you write to a file is initially stored in binary. For text I/O, the individual digits are converted to their corresponding Unicode values and then encoded to the file-specific encoding such as ASCII. For binary I/O, the binary value is directly represented in the file. For example, 199 would be represented as 0xC7, which in binary is 11000111.

Now I'm confused about one part. If a variable is initially stored in a binary format, does each digit represent a separate byte that is stored, or is the entire number stored as a single byte? For example, is 199 originally stored as 0xC7, which would be 11000111 in binary? Or would it be stored in 3 bytes, with each byte representing the binary value for one digit? If it is stored in 3 separate bytes, does binary I/O convert that 3-byte number to a single byte? If it's stored in a single byte, how does text I/O translate that single byte into 3 separate byte values?

I'm just confused on how to word this. Hope you can understand what I'm getting at. Thanks

1 Answer

The only thing a computer is capable of dealing with is sets of 0/1 bits, which are stored in memory or, if you wish, on a storage device. Those bits can be streamed to monitors and converted to characters by graphics hardware. Same story with keyboards: you type a key and a few bits of data are sent to the computer.

Bits are stored in memory and are accessible by memory addresses. The addresses are also sets of bits.

For practical reasons the bits are grouped into bytes, words, long words, ... A byte used to be the smallest addressable unit of bits and historically ended up as a group of 8 bits, which is what most hardware uses today. Modern memory can store data in multi-byte addressable chunks. Same for the disk: you store data there using specific addressing mechanisms. But in any case those are just sets of bits.

What you are confused about is the interpretation of those bits. They can represent integer numbers, floating point numbers, characters, addresses, ... The way they are interpreted depends only on the program which uses them.
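To make the point about interpretation concrete, here is a minimal Java sketch. The class and method names (BitInterpretation, asFloat, asChar) are made up for illustration. It takes one 32-bit pattern and reads it first as an int and then as a float, and takes one byte and reads it as a number and as a character.

```java
public class BitInterpretation {
    // Read the same 32 bits as an IEEE 754 float instead of an int.
    public static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    // Read one byte as a character code (treating the byte as unsigned).
    public static char asChar(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        int bits = 0x42C80000;               // one fixed pattern of 32 bits
        System.out.println(bits);            // 1120403456 when read as an int
        System.out.println(asFloat(bits));   // 100.0 when read as a float

        byte b = 0x41;                       // one fixed pattern of 8 bits
        System.out.println(b);               // 65 when read as a number
        System.out.println(asChar(b));       // A when read as a character
    }
}
```

The bits never change between the two println calls in each pair; only the type the program uses to read them does.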

Characters do not exist in the computer. They are just an abstraction provided by programming languages. The programs interpret the bits stored on the computer. There are standards. For example, the ASCII encoding maps English characters plus a few special characters to numbers from 0 to 127. Those fit into a single byte (leaving the numbers 128 to 255 for special use). A print command will read those bytes one by one and send them to the graphics hardware to form letters on the screen as specified in the encoding standard. A different encoding scheme will display the same bytes differently.
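As a rough illustration of the same bytes being displayed differently, the sketch below decodes one byte array with two of Java's standard charsets. The class and helper names are invented for this example. It prints Unicode code points rather than the characters themselves, so the result does not depend on the console's own encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SameBytesDifferentText {
    // Interpret raw bytes as text using the given encoding.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Render a string as a list of code points, e.g. "U+0048 U+0069".
    public static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (int cp : text.codePoints().toArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0x48 and 0x69 are 'H' and 'i' in ASCII; 0xC7 (decimal 199) is outside ASCII.
        byte[] bytes = {0x48, 0x69, (byte) 0xC7};

        // ISO-8859-1 maps 0xC7 to U+00C7 (C with cedilla).
        System.out.println(codePoints(decode(bytes, StandardCharsets.ISO_8859_1)));
        // prints: U+0048 U+0069 U+00C7

        // US-ASCII has no character for 0xC7, so Java substitutes U+FFFD.
        System.out.println(codePoints(decode(bytes, StandardCharsets.US_ASCII)));
        // prints: U+0048 U+0069 U+FFFD
    }
}
```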

If you write a program with the "hello world" string in it, the program will convert the symbols between the quotes into a set of 11 ASCII bytes. (In C, it will add one more byte with the value 0, which terminates the string.) Unicode is yet another way to represent characters. Depending on the encoding used (UTF-8, UTF-16, ...), a Unicode character is represented by one or more bytes of data. There are other schemes as well. One thing to pay attention to: if you write strings to the disk using a certain encoding, you should read them with the same encoding, or your prints will give you garbage. But you can always read and copy them as binary data without interpretation.
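The byte counts and the "same encoding in, same encoding out" rule can be checked with a small sketch like the one below. The names EncodingRoundTrip, encodedLength and roundTrip are hypothetical, and a byte array stands in for the file on disk.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Number of bytes a string occupies once encoded with the given charset.
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // "Write" text with one encoding and "read" it back with another.
    public static String roundTrip(String text, Charset writeWith, Charset readWith) {
        byte[] stored = text.getBytes(writeWith);   // stand-in for the bytes on disk
        return new String(stored, readWith);
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("hello world", StandardCharsets.US_ASCII)); // 11

        // The euro sign (U+20AC) needs a different number of bytes per encoding.
        System.out.println(encodedLength("\u20AC", StandardCharsets.UTF_8));    // 3
        System.out.println(encodedLength("\u20AC", StandardCharsets.UTF_16BE)); // 2

        String original = "h\u00E9llo"; // "hello" with an e-acute
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String wrong = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(same.equals(original));  // true
        System.out.println(wrong.equals(original)); // false: the e-acute became two characters
    }
}
```

Copying the stored bytes without decoding them would always be lossless, which is the binary-copy case mentioned above.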

So, any variable of any type is just an abstraction and always consists of bytes of data which your program knows how to interpret based on the data type and/or the operations it wants to perform. Variables of type int, double, or any Java object, including String, are just sets of bytes of different sizes. Only the program (and the Java interpreter is a program) knows what to do with them: use them in calculations or display them as characters.
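Applied to the 199 from the question, a sketch along these lines shows the difference. The class and method names are made up, and a ByteArrayOutputStream stands in for the file. In memory, a Java int holding 199 is a single 4-byte binary value, not one byte per digit. Binary I/O copies those bytes out as they are. Text I/O first formats the number as the characters '1', '9', '9' and then encodes each character.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BinaryVsText {
    // Binary I/O: the int's 4 bytes are written as-is (big-endian in DataOutputStream).
    public static byte[] binaryBytes(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Text I/O: the number is formatted as decimal digits, then each digit is encoded.
    public static byte[] textBytes(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    // Show bytes as two-digit hex values separated by spaces.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(binaryBytes(199))); // 00 00 00 C7
        System.out.println(hex(textBytes(199)));   // 31 39 39
    }
}
```

The single 0xC7 byte in the question corresponds to writing just the low byte (for example with DataOutputStream.writeByte); a full int always takes 4 bytes in Java. The three text bytes 0x31, 0x39, 0x39 are the ASCII codes of the digit characters, produced by the formatting step and not by splitting the stored value.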