Compile-Time By-Reference Parameters on the JVM

I am currently developing a custom programming language for the JVM and would like it to support by-reference parameters in methods. How would I go about doing that? So far, I have come up with three different ways to accomplish this.

  1. Wrapper Objects

The idea is to create a wrapper object containing the current value of the field, pass it to the by-ref method call, and unbox it after the call. This is fairly straightforward, but it requires a lot of 'garbage' objects that are created and immediately discarded.
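As a rough sketch of what a compiler could emit for this approach (IntRef, increment and callSite are hypothetical names, not part of any existing API), the desugared call might look like this in Java terms:

```java
final class WrapperRefDemo {
    /** Hypothetical holder a compiler might emit for a by-ref int argument. */
    static final class IntRef {
        int value;
        IntRef(int value) { this.value = value; }
    }

    static int counter = 1;

    // A "by-ref" method after desugaring: it receives the holder, not the field.
    static void increment(IntRef ref) {
        ref.value++;
    }

    // What a source-level call such as increment(ref counter) might compile to.
    static void callSite() {
        IntRef tmp = new IntRef(counter); // box the current field value
        increment(tmp);                   // the callee mutates the holder
        counter = tmp.value;              // unbox back into the field
    }

    public static void main(String[] args) {
        callSite();
        System.out.println(counter);
    }
}
```

The holder lives only for the duration of the call, which is where the short-lived garbage comes from.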

  2. Arrays

Simply create a one-element array of the type, put the field value in the array, call the method passing the array, and finally assign the field from the array. The nice thing about this is that it ensures runtime type safety, unlike a generic wrapper class, which would require additional casts.
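A minimal sketch of the array variant, again with hypothetical names (appendSuffix, callSite); it assumes the compiler knows the static type of the field and can therefore pick the matching array type:

```java
final class ArrayRefDemo {
    static String name = "jvm";

    // By-ref callee after desugaring: the one-element array is the "reference".
    static void appendSuffix(String[] ref) {
        ref[0] = ref[0] + "-lang";
    }

    // What a source-level call such as appendSuffix(ref name) might compile to.
    static void callSite() {
        String[] tmp = { name }; // copy the field into a typed one-element array
        appendSuffix(tmp);
        name = tmp[0];           // no cast needed: the array type is String[]
    }

    public static void main(String[] args) {
        callSite();
        System.out.println(name);
    }
}
```

Because the array is a String[], a callee that tried to store a non-String value would fail at the store, not later at the copy-back.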

  3. Unsafe

This one is slightly more advanced: use sun.misc.Unsafe to allocate some native memory, store the field value in that memory, call the method and pass the address, re-assign the field from the native memory address, and free the memory again.

Bonus: Is it possible to implement pointers and pointer arithmetic using the Unsafe class?

Wrapper Objects [...] but requires a lot of 'garbage' objects that are created and immediately discarded.

If the lifetime of such a wrapper is limited to a callsite (+ inlined callee), then the JIT compiler may be able to prove that through escape analysis and avoid the allocation by decomposing the wrapper object into its primitive members and using them directly in the generated code.

That essentially requires that those reference wrappers are never stored to fields and are only passed as method arguments.

Unsafe Use sun.misc.Unsafe to allocate some native memory space, store the field value on that memory

You cannot store object-references in native memory. The garbage collector would not know about it and thus could change the memory address under your feet or GC the object if that is your only reference.

But since you're creating your own language, you could simply desugar field references into an object reference plus an offset, i.e. pass two parameters (object ref + long offset) instead of one. If you know the offset, you can use Unsafe to manipulate the field.

Obviously this will only work for object fields. Local references cannot be changed this way.
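The object-plus-offset idea could be sketched roughly as follows. This assumes a JDK where sun.misc.Unsafe is still reachable through the theUnsafe field and where objectFieldOffset, getInt and putInt are still functional; these methods are deprecated in recent JDK releases, so treat it as an illustration only. Counter, offsetOf and increment are hypothetical names.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeFieldRefDemo {
    static final Unsafe UNSAFE = loadUnsafe();

    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static final class Counter { int value; }

    // Resolved once per field; a compiler could cache this per call site.
    static long offsetOf(Class<?> type, String fieldName) {
        try {
            return UNSAFE.objectFieldOffset(type.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // By-ref callee after desugaring: (base, offset) stands in for the reference.
    static void increment(Object base, long offset) {
        UNSAFE.putInt(base, offset, UNSAFE.getInt(base, offset) + 1);
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        increment(c, offsetOf(Counter.class, "value"));
        System.out.println(c.value);
    }
}
```

Every write goes straight to the field, so there is no copy-back step after the call.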

Bonus: Is it possible to implement pointers and pointer arithmetic using the Unsafe class?

Yes, for unmanaged memory.

For memory within the managed heap you are only allowed to point to objects themselves and do pointer arithmetic relative to the object header.
And you must always store object references in Object-typed fields. Storing them in a long would lead to GC implementations (precise ones at least) missing the reference.


Edit: You might also be interested in ongoing work in the JDK regarding VarHandles. It's something you probably want to keep in mind when developing your language.
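VarHandles have since shipped as java.lang.invoke.VarHandle (Java 9 and later). A minimal sketch of how one could stand in for a field or array-slot reference, assuming hypothetical names Counter, increment and incrementSlot:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class VarHandleRefDemo {
    static final class Counter { int value; }

    // Handle for the instance field Counter.value, looked up once.
    static final VarHandle VALUE = lookupValue();
    // Handle for elements of any int[]; its coordinates are (array, index).
    static final VarHandle INT_SLOT = MethodHandles.arrayElementVarHandle(int[].class);

    static VarHandle lookupValue() {
        try {
            return MethodHandles.lookup().findVarHandle(Counter.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // By-ref callee for a field: (handle, receiver) stands in for the reference.
    static void increment(VarHandle ref, Counter receiver) {
        ref.set(receiver, (int) ref.get(receiver) + 1);
    }

    // By-ref callee for an array slot: (handle, array, index) is the reference.
    static void incrementSlot(VarHandle ref, int[] array, int index) {
        ref.set(array, index, (int) ref.get(array, index) + 1);
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        increment(VALUE, c);
        int[] data = { 10, 20 };
        incrementSlot(INT_SLOT, data, 1);
        System.out.println(c.value + " " + data[1]);
    }
}
```

Unlike the Unsafe variant, access checks are performed when the handle is looked up, and no raw offsets are involved.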

It seems you have missed an important point about the pass-by-reference concept: whenever a write into the reference happens, the referenced variable will be updated. This is different from any concept like yours that will actually pass a copy in a holder and update the original variable upon method return.

You can notice the difference even in a single-threaded use case:

foo(myField, ()-> {
    // if myField is pass-by-reference, whenever foo() modifies
    // it and calls this Runnable, it should see the new value:
    System.out.println(myField);
});

Of course, you could make both references accessing the same wrapper, but for an environment allowing (almost) arbitrary code, it would imply that you would have to replace every reference to the field (in the end, change the contents of the field) to the wrapper.
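To make the difference concrete, here is a small Java sketch of the copy-in/copy-out desugaring applied to the foo example; myField, foo and run are hypothetical names chosen for the illustration. The callback reads the field while foo is still running and therefore sees the old value:

```java
final class CopyInCopyOutDemo {
    static int myField = 1;
    static int seenByCallback;

    // Desugared by-ref callee: writes to the holder, then invokes the callback.
    static void foo(int[] ref, Runnable callback) {
        ref[0] = 42;    // the callee "writes through the reference"
        callback.run(); // the callback reads the original field
    }

    static int run() {
        int[] tmp = { myField };                  // copy in
        foo(tmp, () -> seenByCallback = myField); // field not yet updated here
        myField = tmp[0];                         // copy out, only after return
        return seenByCallback;
    }

    public static void main(String[] args) {
        System.out.println(run());   // value observed by the callback
        System.out.println(myField); // value after the call has returned
    }
}
```

With true pass-by-reference semantics the callback would observe 42; with the holder scheme it observes the stale 1, and the field only becomes 42 once foo has returned.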


So if you want to implement a clean, real pass-by-reference mechanism within the JVM, it must be able to modify the referenced artifact, i.e. a field or array slot. For local variables, there is no way to do that, so there's no way around replacing a local variable with a holder object once a reference to it has been created.

So the kinds of options are already known: you can pass a java.lang.reflect.Field (which does not work with array slots), a pair of java.lang.invoke.MethodHandle instances, or an object of an arbitrary (generated) type offering read and write access.
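The MethodHandle option might look roughly like this. The sketch binds a getter and a setter to a concrete object or array slot; refTo and increment are hypothetical helper names, and the two-element array holding the pair is just a convenience for the example:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

final class HandlePairDemo {
    static final class Counter { int value; }

    // Reference to an instance field: getter and setter bound to the target object.
    static MethodHandle[] refTo(Counter target) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            return new MethodHandle[] {
                lookup.findGetter(Counter.class, "value", int.class).bindTo(target),
                lookup.findSetter(Counter.class, "value", int.class).bindTo(target)
            };
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reference to an array slot: array and index are pre-bound into both handles.
    static MethodHandle[] refTo(int[] array, int index) {
        return new MethodHandle[] {
            MethodHandles.insertArguments(
                MethodHandles.arrayElementGetter(int[].class), 0, array, index),
            MethodHandles.insertArguments(
                MethodHandles.arrayElementSetter(int[].class), 0, array, index)
        };
    }

    // By-ref callee: reads and writes go directly to the referenced location.
    static void increment(MethodHandle getter, MethodHandle setter) {
        try {
            int current = (int) getter.invoke();
            setter.invoke(current + 1);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        MethodHandle[] fieldRef = refTo(c);
        increment(fieldRef[0], fieldRef[1]);

        int[] data = { 5, 6 };
        MethodHandle[] slotRef = refTo(data, 0);
        increment(slotRef[0], slotRef[1]);
        System.out.println(c.value + " " + data[0]);
    }
}
```

The same callee works for fields and array slots, which a plain java.lang.reflect.Field cannot offer.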

When implementing this reference accessor type, you can resort to Unsafe to create an anonymous class, just like Java's lambda expression facility does. In fact, you can draw a lot of inspiration from the lambda expression mechanism:

  • Put an invokedynamic instruction at the place where a reference has to be created, pointing to your factory method and providing a handle to the field or array slot
  • Let the factory analyze the handle and dynamically create the accessor implementation, the main difference being that your type will have two operations, read and write
  • Use Unsafe to create that class (which might access the field, even if it's private)
  • If the field is static, create an instance and return a CallSite with a handle returning that instance
  • Otherwise return a CallSite with a handle pointing to the constructor of the accessor class accepting an object instance or an array

This way you will only have an overhead at first-time usage, while subsequent uses will either use a singleton in the case of static fields or construct an accessor on the fly for instance fields and array slots. The creation of these accessor instances can be elided by HotSpot's escape analysis if they are used frequently, just like with ordinary objects.
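The invokedynamic bootstrap itself cannot be written in Java source, but a hand-written equivalent of what such a generated accessor might look like is sketched below. IntRef, CounterValueRef and IntArraySlotRef are hypothetical names; a real implementation would generate these classes at runtime instead of declaring them up front.

```java
final class AccessorRefDemo {
    /** Hypothetical accessor type with the two operations: read and write. */
    interface IntRef {
        int get();
        void set(int value);
    }

    static final class Counter { private int value; }

    // Stand-in for a generated accessor bound to an instance field.
    static final class CounterValueRef implements IntRef {
        private final Counter target;
        CounterValueRef(Counter target) { this.target = target; }
        public int get() { return target.value; }
        public void set(int value) { target.value = value; }
    }

    // Stand-in for a generated accessor bound to an array slot.
    static final class IntArraySlotRef implements IntRef {
        private final int[] array;
        private final int index;
        IntArraySlotRef(int[] array, int index) { this.array = array; this.index = index; }
        public int get() { return array[index]; }
        public void set(int value) { array[index] = value; }
    }

    // By-ref callee: the write is visible to the callback immediately.
    static void foo(IntRef ref, Runnable callback) {
        ref.set(42);
        callback.run();
    }

    static int demo() {
        Counter counter = new Counter();
        int[] observed = new int[1];
        foo(new CounterValueRef(counter), () -> observed[0] = counter.value);
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

Because the accessor writes straight into the field, the callback observes the updated value during the call, which is the behavior the holder-based schemes cannot provide.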
