Is it possible to make java.lang.invoke.MethodHandle as fast as direct invocation?

Question

I'm comparing the performance of MethodHandle::invoke and direct static method invocation. Here is the static method:

public class IntSum {
    public static int sum(int a, int b){
        return a + b;
    }
}

And here is my benchmark:

@State(Scope.Benchmark)
public class MyBenchmark {

    public int first;
    public int second;
    public final MethodHandle mhh;

    @Benchmark
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    public int directMethodCall() {
        return IntSum.sum(first, second);
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    public int finalMethodHandle() throws Throwable {
        return (int) mhh.invoke(first, second);
    }

    public MyBenchmark() {
        MethodHandle mhhh = null;

        try {
            mhhh = MethodHandles.lookup().findStatic(IntSum.class, "sum", MethodType.methodType(int.class, int.class, int.class));
        } catch (NoSuchMethodException | IllegalAccessException e) {
            e.printStackTrace();
        }

        mhh = mhhh;
    }

    @Setup
    public void setup() throws Exception {
        first = 9857893;
        second = 893274;
    }
}

I got the following result:

Benchmark                      Mode  Cnt  Score   Error  Units
MyBenchmark.directMethodCall   avgt    5  3.069 ± 0.077  ns/op
MyBenchmark.finalMethodHandle  avgt    5  6.234 ± 0.150  ns/op

The MethodHandle call shows some performance degradation: it takes about twice as long per operation as the direct call.

Running it with -prof perfasm shows this:

....[Hottest Regions]...............................................................................
 31.21%   31.98%         C2, level 4  java.lang.invoke.LambdaForm$DMH::invokeStatic_II_I, version 490 (27 bytes) 
 26.57%   28.02%         C2, level 4  org.sample.generated.MyBenchmark_finalMethodHandle_jmhTest::finalMethodHandle_avgt_jmhStub, version 514 (84 bytes) 
 20.98%   28.15%         C2, level 4  org.openjdk.jmh.infra.Blackhole::consume, version 497 (44 bytes)

As far as I could figure out, the reason for this result is that Hottest Region 2, org.sample.generated.MyBenchmark_finalMethodHandle_jmhTest::finalMethodHandle_avgt_jmhStub, contains all the type checks performed by MethodHandle::invoke inside the JMH loop. Assembly output fragment (some code omitted):

....[Hottest Region 2]..............................................................................
C2, level 4, org.sample.generated.MyBenchmark_finalMethodHandle_jmhTest::finalMethodHandle_avgt_jmhStub, version 519 (84 bytes) 
;...
0x00007fa2112119b0: mov     0x60(%rsp),%r10
;...
0x00007fa2112119d4: mov     0x14(%r12,%r11,8),%r8d  ;*getfield form
0x00007fa2112119d9: mov     0x1c(%r12,%r8,8),%r10d  ;*getfield customized
0x00007fa2112119de: test    %r10d,%r10d
0x00007fa2112119e1: je      0x7fa211211a65    ;*ifnonnull
0x00007fa2112119e7: lea     (%r12,%r11,8),%rsi
0x00007fa2112119eb: callq   0x7fa211046020    ;*invokevirtual invokeBasic
;...
0x00007fa211211a01: movzbl  0x94(%r10),%r10d  ;*getfield isDone
;...
0x00007fa211211a13: test    %r10d,%r10d
;jump back to the beginning of the JMH loop if not done
0x00007fa211211a16: je      0x7fa2112119b0    ;*aload_1 
;...

Before calling invokeBasic we perform the type checking (inside the JMH loop), which affects the reported avgt score.

QUESTION: Why aren't all the type checks moved out of the loop? I declared public final MethodHandle mhh; inside the benchmark, so I expected the compiler to figure that out and eliminate the repeated type checks. How can I get these type checks eliminated? Is it possible?

Answer 1

You are using reflective invocation of MethodHandle. It works roughly like Method.invoke, but with fewer run-time checks and without boxing/unboxing. Since this MethodHandle is not static final, the JVM does not treat it as a constant; that is, the MethodHandle's target is a black box and cannot be inlined.

Even though mhh is final, it contains instance fields such as MethodType type and LambdaForm form that are reloaded on each iteration. These loads are not hoisted out of the loop because of the black-box call inside it (see above). Furthermore, the LambdaForm of a MethodHandle can be changed (customized) at run time between calls, so it needs to be reloaded.
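
As background for the type checks discussed here, the following minimal sketch contrasts the two call styles a MethodHandle offers: invoke, which adapts the call-site type to the handle's type, and invokeExact, which requires an exact match. The class name InvokeVsExactSketch and its helper methods are hypothetical and not part of the original benchmark; the sketch shows only the semantics, not the performance.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class; the names are assumptions made for illustration.
final class InvokeVsExactSketch {

    static int sum(int a, int b) {
        return a + b;
    }

    // Looks up a handle of type (int,int)int for the static method above.
    static MethodHandle sumHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
                InvokeVsExactSketch.class, "sum",
                MethodType.methodType(int.class, int.class, int.class));
    }

    // The call site here has type (int,int)Object; invoke() adapts it
    // to the handle's type and boxes the int result.
    static Object genericInvoke() throws Throwable {
        Object result = sumHandle().invoke(2, 3);
        return result;
    }

    // The cast makes the call-site type (int,int)int, which matches exactly.
    static int exactInvoke() throws Throwable {
        return (int) sumHandle().invokeExact(2, 3);
    }

    // Without the cast the call-site type is (int,int)Object, so
    // invokeExact() rejects the call instead of adapting it.
    static boolean exactMismatchThrows() throws Throwable {
        try {
            Object ignored = sumHandle().invokeExact(2, 3);
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(genericInvoke());
        System.out.println(exactInvoke());
        System.out.println(exactMismatchThrows());
    }
}
```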

How to make the call faster?

1. Use a static final MethodHandle. The JIT will know the target of such a MethodHandle and thus may inline it at the call site.
2. Even if you have a non-static MethodHandle, you may bind it to a static CallSite and invoke it as fast as a direct method. This is similar to how lambdas are called.
```
private static final MutableCallSite callSite = new MutableCallSite(
        MethodType.methodType(int.class, int.class, int.class));
private static final MethodHandle invoker = callSite.dynamicInvoker();

public MethodHandle mh;

public MyBenchmark() {
    mh = ...;
    callSite.setTarget(mh);
}

@Benchmark
public int boundMethodHandle() throws Throwable {
    return (int) invoker.invokeExact(first, second);
}
```
3. Use a regular invokeinterface instead of MethodHandle.invoke, as @Holger suggested. An instance of an interface for calling a given MethodHandle can be generated with LambdaMetafactory.metafactory().
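
The LambdaMetafactory option can be sketched roughly as follows. This is a minimal illustration, not code from the answer: it assumes java.util.function.IntBinaryOperator as the target interface, and the class name LambdaFactorySketch and the method wrap are hypothetical. The sketch only shows how the interface instance is produced; it makes no claim about measured speed.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntBinaryOperator;

// Hypothetical demo class; IntBinaryOperator is an assumed choice of interface.
final class LambdaFactorySketch {

    static int sum(int a, int b) {
        return a + b;
    }

    // Wraps the static method in an IntBinaryOperator so that callers
    // use an ordinary interface call instead of MethodHandle.invoke.
    static IntBinaryOperator wrap() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType signature =
                MethodType.methodType(int.class, int.class, int.class);
        MethodHandle target =
                lookup.findStatic(LambdaFactorySketch.class, "sum", signature);

        CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "applyAsInt",                                   // interface method name
                MethodType.methodType(IntBinaryOperator.class), // factory type, nothing captured
                signature,                                      // erased interface method type
                target,                                         // implementation to call
                signature);                                     // instantiated method type

        // The factory handle takes no arguments and returns the interface instance.
        return (IntBinaryOperator) site.getTarget().invokeExact();
    }

    public static void main(String[] args) throws Throwable {
        IntBinaryOperator op = wrap();
        System.out.println(op.applyAsInt(2, 3));
    }
}
```

The returned object can be stored in an ordinary field and called like any other IntBinaryOperator.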

Answer 2

Make MethodHandle mhh static:

Benchmark            Mode  Samples  Score   Error  Units
directMethodCall     avgt        5  0,942 ± 0,095  ns/op
finalMethodHandle    avgt        5  0,906 ± 0,078  ns/op

Non-static:

Benchmark            Mode  Samples  Score   Error  Units
directMethodCall     avgt        5  0,897 ± 0,059  ns/op
finalMethodHandle    avgt        5  4,041 ± 0,463  ns/op
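
For reference, declaring the handle static might look like the following minimal sketch. It is not the answerer's actual benchmark code: the JMH annotations are left out to keep it self-contained, the class name StaticHandleSketch and the method viaHandle are hypothetical, and invokeExact is used here where the question's benchmark uses invoke.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; the names are assumptions made for illustration.
final class StaticHandleSketch {

    static int sum(int a, int b) {
        return a + b;
    }

    // Held in a static final field and initialized once in the static initializer.
    private static final MethodHandle SUM;

    static {
        try {
            SUM = MethodHandles.lookup().findStatic(
                    StaticHandleSketch.class, "sum",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The cast and the int arguments give the call site the exact type (int,int)int.
    static int viaHandle(int a, int b) throws Throwable {
        return (int) SUM.invokeExact(a, b);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(viaHandle(2, 3));
    }
}
```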


Answer 1: score 9, accepted, 2018-03-15 15:39:40
Answer 2: score 4, 2018-03-15 05:44:13
