
JNA Native.setXXX() slow

Not sure if this is the right place to ask. I noticed huge contributions from Native.setShort in my case, in YourKit (but any other profiler will do). This method sets fields in a Structure in order to fill JNA library call arguments; setShort is called some 10 levels underneath the JNA library proxy call. The actual function call into Windows' kernel32.dll does not appear in the sampling at all, and neither does any Structure.read activity when the values are returned.

Now, I looked at what this and the other primitive value setters do: they take the address of the argument and move sizeof(argument) bytes to the target address via memcpy or bcopy, possibly surrounded by try/catch macros. Why is this done? Why not just something like:

*((short*) target) = value

Would this be more efficient, or is the try/catch important here? The surrounding PSTART/PEND macros do not seem to always generate a try/catch. This is the most recent JNA, 4.2.0, grabbed from git.

Update: It looks like this was the profiler playing a practical joke on me. Today the wasted time was spread more evenly across the other call stack levels leading to and from the actual native call.

My solution was to use JNA direct mapping: add another function to my own DLL, on top of the OS API, that takes primitive pointers instead of a Structure.ByReference to return values. Calling this function with a one-element primitive array for each return parameter took 370 ns vs. 1500 ns (excluding new Structure.ByReference()).
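For illustration, a minimal sketch of that direct-mapping approach, assuming a wrapper DLL called mydll that exports a function getValues(short*, int*); both names are hypothetical, and the one-element arrays stand in for the primitive out-pointers:

import com.sun.jna.Native;

public class FastCalls {
    // Direct mapping: the static native method is bound straight to the DLL
    // symbol via Native.register, bypassing the reflective interface proxy.
    // "mydll" and "getValues" are placeholder names for the wrapper DLL/function.
    public static native int getValues(short[] outShort, int[] outInt);

    static {
        Native.register(FastCalls.class, "mydll");
    }

    public static void main(String[] args) {
        // One-element primitive arrays receive whatever the native side writes
        // through its short* / int* out-parameters.
        short[] s = new short[1];
        int[] i = new int[1];
        int rc = getValues(s, i);
        System.out.println("rc=" + rc + ", short=" + s[0] + ", int=" + i[0]);
    }
}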

So, in the end, the Native.setXXX() methods really are slow, together with all the glue code around JNA method invocations, but JNA direct mapping does the trick. I have never tested actual JNI calls, so I cannot compare timings there.

On *nix systems, PSTART/PEND do a setjmp/longjmp to trap a range of memory faults (but only if Native.setProtected(true) has been called). On Windows, JNA uses structured exception handling (basically try/catch) to do the same thing, and it is on by default.
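For reference, that protection is toggled and queried through the Native class; a small sketch (whether it actually takes effect depends on the platform and on how the native stub was built):

import com.sun.jna.Native;

public class ProtectionCheck {
    public static void main(String[] args) {
        // Request that native memory faults be trapped and surfaced as Java
        // Errors instead of crashing the VM; isProtected() reports whether
        // the request is actually in effect on this platform.
        Native.setProtected(true);
        System.out.println("VM crash protection active: " + Native.isProtected());
    }
}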

Even when enabled, it is unlikely they add much overhead compared to the cost of the JNI transition itself (going from Java to C or vice versa).

Generally, passing a Structure to native code is not terribly efficient, since the automatic writes and reads depend heavily on reflection to copy Java fields into and out of native memory. In most cases, though, it doesn't really matter.
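As an illustration of that reflective path, here is a sketch of a typical interface-mapped call that fills a Structure; the library name (mydll), the function (getResult) and the fields are hypothetical. JNA writes every Java field to native memory before the call and reads them all back afterwards:

import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Structure;

import java.util.Arrays;
import java.util.List;

public class StructureCall {
    // Hypothetical struct; on every call JNA copies these fields to native
    // memory (Structure.write) and back again (Structure.read) via reflection.
    public static class Result extends Structure implements Structure.ByReference {
        public short code;
        public int value;

        @Override
        protected List<String> getFieldOrder() {
            return Arrays.asList("code", "value");
        }
    }

    // Interface mapping: every call goes through a reflective library proxy.
    public interface MyLib extends Library {
        MyLib INSTANCE = (MyLib) Native.loadLibrary("mydll", MyLib.class);
        int getResult(Result out);  // hypothetical native function
    }

    public static void main(String[] args) {
        Result r = new Result();
        int rc = MyLib.INSTANCE.getResult(r);
        System.out.println("rc=" + rc + ", code=" + r.code + ", value=" + r.value);
    }
}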

For those few cases where you do find a bottleneck, you'd want to use JNA's direct mapping and restrict yourself to primitive arguments.
