
Variables stored on heap vs. stack in a non-fully compiled language (e.g. Java)?

I'm learning Java and reading how primitives (defined in methods) are stored on "the stack," vs. other things which are stored on "the heap."

But, Java is not a fully compiled to executable language, so what does it mean for things to be stored on "the stack"?

I would think that the JVM, while reading the bytecode, would have to get storage for everything using malloc/new/etc.

Same goes for languages like Python (though I have not read anywhere that Python stores variables on stack, so no confusion for me). Since these languages are interpreted, the interpreter would, when encountering a variable definition, have to dynamically allocate memory for it, right?

The language is just an abstraction. Any implementation is allowed as long as it provides the results dictated by the language specification.

When someone says that primitives are stored on the stack and objects are stored on the heap, what they really mean is that this is a natural way to implement an interpreter. In practice, you'll most likely be using a JIT, in which case objects can sometimes be stored on the stack as well. But these are all implementation details that are abstracted away, so you shouldn't have to care about them. If you do, you need to find out how the particular VM you are using works.
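To make the JIT remark concrete, here is a minimal sketch, assuming a VM such as HotSpot that performs escape analysis. The names `Point`, `EscapeSketch` and `sumOfSquares` are made up for illustration. Because the `Point` never leaves the method, the JIT is allowed to replace it with plain local values and skip the heap allocation. Whether it actually does so depends on the VM, and the program's result is the same either way.

```java
// Hypothetical illustration; Point, EscapeSketch and sumOfSquares are made-up names.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class EscapeSketch {
    // The Point created in the loop is never stored in a field or returned,
    // so it does not "escape" this method. A JIT with escape analysis may
    // keep its two fields in registers or on the stack instead of the heap.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i); // logically a heap object
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```

Nothing in the source code says where `p` lives, which is the point of the answer above: the language only fixes the observable result.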

Java is not a fully compiled to executable language, so what does it mean for things to be stored on "the stack"?

Not at the first step. But when you run your Java program, the JVM either interprets the bytecode or has the JIT compile it into machine language. Either way, any program must ultimately be turned into machine code, or be executed by existing machine code (the interpreter), in order to run.

I would think that the JVM, while reading the bytecode, would have to get storage for everything using malloc/new/etc.

In order to allocate data on the stack, you (usually) move the stack pointer forward or backward (depending on the stack architecture). For example, in MASM syntax on 32-bit x86, to allocate one 4-byte integer you subtract 4 from the stack pointer:

sub esp, 4 ; sub = subtract, esp = extended stack pointer

Why am I telling you all this? Because when the JIT sees something like

int x; // or the intermediate-language equivalent

it can transform it into

sub esp, 4

and hence allocate the integer on the stack.

But I think I recognize where the confusion comes from.
Both stack and heap allocation are done at run time.

The only difference is that in C (and C++) the size of a stack allocation is static: it is determined at compile time, whereas the size of a dynamic allocation is determined (or can be changed) at run time.
The JIT compiles the code at run time, but it hard-codes the size of the stack allocation into the assembly code, so the size is still "static".

A stack is just a memory region managed in a certain way. The specification doesn't require a particular allocation strategy, but in the end the JVM always has to allocate the required memory one way or another, regardless of whether the code to be executed has been compiled or is being interpreted.

This is no different from programs developed in a programming language that compiles directly to native code. Programs still have to allocate memory for stacks, though it may happen behind the scenes (well, in Java it also happens behind the scenes from an application programmer's point of view).

But it seems you have a wrong idea of the stack anyway. Most modern programming languages, including Java, organize the stack in frames. A frame is capable of holding all local variables and the deepest operand stack that you may encounter within a method. The frame for a method is allocated right at the method's entry, and no further allocations are performed in the course of the execution or interpretation of the method's bytecode.
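A small sketch of what "one frame per method invocation" means in practice. The class and method names (`FrameSketch`, `depthSum`) are hypothetical. Each recursive call gets its own frame and therefore its own copy of `local`, which is why the value is still intact after the nested call returns.

```java
// Hypothetical illustration; FrameSketch and depthSum are made-up names.
class FrameSketch {
    // Each invocation of depthSum has its own frame holding n, local and below.
    static int depthSum(int n) {
        int local = n;               // lives in this invocation's frame
        if (n == 0) {
            return 0;
        }
        int below = depthSum(n - 1); // callee's frame is pushed, then popped
        return local + below;        // 'local' was not touched by the callee
    }

    public static void main(String[] args) {
        System.out.println(depthSum(4)); // 4 + 3 + 2 + 1 + 0
    }
}
```

No statement inside the method allocates anything for `local` or `below`; the space for them is part of the frame that exists from method entry to method exit.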

Or, in other words, Java's bytecode instruction set doesn't have such a thing as a "variable definition" to process. There are only instructions for transferring items between the local variables (addressed by index) and the operand stack, or between the operand stack and the heap. The existence of a local variable is implied by what has been written into it. There is optional debugging information hinting at which variables should exist at which code location, but this information isn't processed during normal execution.
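To illustrate, here is a tiny method together with the bytecode that `javac` typically emits for it, as shown by `javap -c`. The class name `BytecodeSketch` is made up, and the exact listing may differ between compiler versions, so treat the comment as a sketch rather than guaranteed output.

```java
// Hypothetical illustration; BytecodeSketch is a made-up name.
class BytecodeSketch {
    static int add(int a, int b) {
        int sum = a + b; // no "declare sum" instruction exists in the bytecode
        return sum;
    }

    // Typical `javap -c BytecodeSketch` listing for add(int, int):
    //
    //   iload_0     // push local variable 0 (a) onto the operand stack
    //   iload_1     // push local variable 1 (b)
    //   iadd        // pop both, push their sum
    //   istore_2    // pop the sum into local variable 2 (sum)
    //   iload_2     // push local variable 2 again
    //   ireturn     // return the int on top of the operand stack

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```

The variable `sum` appears only as slot 2; it comes into existence when `istore_2` writes to it. The class file records how many local slots and how deep an operand stack the method needs, which is what allows the whole frame to be sized at method entry.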

Depending on the JVM implementation, each thread may have a pre-allocated memory region with a fixed maximum stack size in which its stack frames are placed. In these implementations, no allocation in the sense of operating system operations is performed for the stack during the life of a thread. A lot of native code follows the same model.
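The per-thread stack can be probed from Java itself. The sketch below uses the standard `Thread(ThreadGroup, Runnable, String, long stackSize)` constructor; the class and method names (`StackSizeSketch`, `maxDepth`, `dive`) are hypothetical. The `stackSize` argument is only a hint that a VM may round or ignore, so the depths reached vary between VMs and platforms, and no particular numbers should be expected.

```java
// Hypothetical illustration; StackSizeSketch, maxDepth and dive are made-up names.
class StackSizeSketch {
    // Recurses on a fresh thread until its stack is exhausted and
    // reports how many frames fit. requestedStackBytes is only a hint.
    static int maxDepth(long requestedStackBytes) {
        int[] depth = {0};
        Runnable probe = () -> {
            try {
                dive(depth);
            } catch (StackOverflowError expected) {
                // The thread's stack region is full; depth[0] holds the count.
            }
        };
        Thread t = new Thread(null, probe, "stack-probe", requestedStackBytes);
        t.start();
        try {
            t.join(); // join() makes the probe thread's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return depth[0];
    }

    // Unbounded recursion: every call pushes one more frame.
    private static void dive(int[] depth) {
        depth[0]++;
        dive(depth);
    }

    public static void main(String[] args) {
        System.out.println("small stack: " + maxDepth(256L * 1024));
        System.out.println("large stack: " + maxDepth(8L * 1024 * 1024));
    }
}
```

On VMs that honor the hint, the second probe usually gets deeper than the first, which matches the picture of frames being placed into a region whose maximum size is fixed when the thread is created.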
