How to speed up runtime Java code instrumentation?

Question

I wrote a Java agent that attaches to a running JVM, instruments all loaded project classes, and inserts some logging statements. There are 11k classes in total. I measured the total time spent in the transform method of my ClassFileTransformer, and it was 3 seconds. But the whole instrumentation process takes about 30 seconds. This is how I retransform my classes:

 instrumentation.retransformClasses(myClassesArray);

I assume most of the time is taken by the JVM reloading the changed classes. Is that right? How can I speed up the instrumentation process?

Update:
When my agent is attached,

instrumentation.addTransformer(new MyTransfomer(), true);
instrumentation.retransformClasses(retransformClassArray);

is called only once.
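For context, here is a minimal sketch of how an agent entry point might pick the classes to pass to retransformClasses. The class name AgentSketch, the package prefix com.example. and the helper names are assumptions for illustration and do not come from the question. It relies only on Instrumentation methods that exist in the JDK (getAllLoadedClasses, isModifiableClass, retransformClasses).

```java
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.util.ArrayList;
import java.util.List;

public class AgentSketch {
    // Hypothetical package prefix that identifies "project" classes.
    static final String PROJECT_PREFIX = "com.example.";

    // Entry point the JVM invokes when the agent is attached at runtime.
    // The agent jar's manifest needs Agent-Class and Can-Retransform-Classes: true.
    public static void agentmain(String args, Instrumentation inst)
            throws UnmodifiableClassException {
        Class<?>[] targets = selectTargets(inst, inst.getAllLoadedClasses());
        if (targets.length > 0) {
            // A retransform-capable transformer must already be registered
            // via addTransformer(transformer, true) for this to have any effect.
            inst.retransformClasses(targets);
        }
    }

    // Keeps only project classes that the JVM reports as modifiable.
    static Class<?>[] selectTargets(Instrumentation inst, Class<?>[] loaded) {
        List<Class<?>> selected = new ArrayList<>();
        for (Class<?> c : loaded) {
            if (isProjectClass(c.getName()) && inst.isModifiableClass(c)) {
                selected.add(c);
            }
        }
        return selected.toArray(new Class<?>[0]);
    }

    // Class.getName() uses dots as separators, e.g. "com.example.Foo".
    static boolean isProjectClass(String className) {
        return className.startsWith(PROJECT_PREFIX);
    }
}
```

Passing fewer classes to retransformClasses is the most direct way to reduce the work the JVM has to do, so a tight filter is worth having regardless of where the remaining time goes.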

Then the MyTransfomer class instruments the classes and measures the total duration of the instrumentation:


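import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

public class MyTransfomer implements ClassFileTransformer {
    private long total = 0;
    private long min = ..., max = ...;

    @Override
    public final byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                  ProtectionDomain protectionDomain, byte[] classFileBuffer) {
        long s = System.currentTimeMillis();
        if (s < min) min = s; // earliest start time seen so far
        if (s > max) max = s; // latest start time seen so far
        // transformInner (not shown) performs the actual bytecode rewriting
        byte[] transformed = this.transformInner(loader, className, classFileBuffer);

        this.total += System.currentTimeMillis() - s; // time spent inside transform only

        return transformed;
    }
}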

After all the classes from the initial array are instrumented (a global cache keeps track of the instrumented classes), total is printed and it is ~3 seconds. But max-min is ~30 seconds.
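The two numbers above (time spent inside transform versus the wall-clock span between the first and last call) can be collected with a small wrapper. The following is only a sketch: TimingTransformer and its accessor names are made up for illustration. It uses System.nanoTime() and atomics so the counters stay consistent if transform is ever called from more than one thread.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

public class TimingTransformer implements ClassFileTransformer {
    private final ClassFileTransformer delegate;
    private final long origin = System.nanoTime();           // reference point for offsets
    private final AtomicLong insideNanos = new AtomicLong();  // sum of time inside transform
    private final AtomicLong firstStart = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong lastEnd = new AtomicLong(0);
    private final AtomicLong calls = new AtomicLong();

    public TimingTransformer(ClassFileTransformer delegate) {
        this.delegate = delegate;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer)
            throws IllegalClassFormatException {
        long start = System.nanoTime() - origin;
        firstStart.accumulateAndGet(start, Math::min);
        try {
            return delegate.transform(loader, className, classBeingRedefined,
                                      protectionDomain, classfileBuffer);
        } finally {
            long end = System.nanoTime() - origin;
            insideNanos.addAndGet(end - start);
            lastEnd.accumulateAndGet(end, Math::max);
            calls.incrementAndGet();
        }
    }

    public long calls() { return calls.get(); }

    // Total time spent inside the delegate's transform method.
    public long insideNanos() { return insideNanos.get(); }

    // Wall-clock time from the first transform call starting to the last one ending.
    public long wallSpanNanos() {
        return calls.get() == 0 ? 0 : lastEnd.get() - firstStart.get();
    }
}
```

The difference between wallSpanNanos() and insideNanos() approximates the time spent outside the transformer between calls, which is the ~27 seconds the question is asking about.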

Update 2:

After looking at the stack trace, this is what happens: I call

instrumentation.retransformClasses(retransformClassArray);

which calls the native method retransformClasses0(). After some time(!) the JVM calls the transform() method of the sun.instrument.InstrumentationImpl class (this method takes only one class at a time, so the JVM calls it multiple times consecutively). That method calls transform() on the sun.instrument.TransformerManager object, which holds a list of all registered transformers and calls each of them to transform the class (I have only one transformer registered!).

So in my opinion, most of the time is spent in the JVM (after retransformClasses0() is called and before each call to sun.instrument.InstrumentationImpl.transform()). Is there a way to reduce the time the JVM needs to carry out this task?

Answer 1

Correction:

retransformClasses(classArr) does not retransform the elements of classArr lazily, one at a time as needed (e.g. while linking). It does retransform all of them at once (refer to VM_RedefineClasses and jvmtiEnv in the JDK sources).

What retransformClasses() does:

1. It transfers control to the native layer and gives it the list of classes we want to transform.
2. For every class to be transformed, the native code tries to get a new version by calling our Java transformer. This leads to a transfer of control between Java code and native code.
3. The native code replaces the appropriate parts of the internal representation with the given new class version, one class after another.

In step 1:

java.lang.instrument.Instrumentation#retransformClasses calls sun.instrument.InstrumentationImpl#retransformClasses0, which is a JNI method, so control is transferred to the native layer.

// src/hotspot/share/prims/jvmtiEnv.cpp
jvmtiError
JvmtiEnv::RetransformClasses(jint class_count, const jclass* classes) {
  ...
  VM_RedefineClasses op(class_count, class_definitions, jvmti_class_load_kind_retransform);
  VMThread::execute(&op);
  ...
} /* end RetransformClasses */

In step 2:

This step is implemented by KlassFactory::create_from_stream. This procedure posts a ClassFileLoadHook event whose callback obtains the transformed bytecode by invoking the Java transformer method. In this step, control switches back and forth between native code and Java code.

// src/hotspot/share/classfile/klassFactory.cpp
// check and post a ClassFileLoadHook event before loading a class
// Skip this processing for VM hidden or anonymous classes
if (!cl_info.is_hidden() && (cl_info.unsafe_anonymous_host() == NULL)) {
  stream = check_class_file_load_hook(stream,
                                      name,
                                      loader_data,
                                      cl_info.protection_domain(),
                                      &cached_class_file,
                                      CHECK_NULL);
}

//src/java.instrument/share/native/libinstrument/JPLISAgent.c :
//call java code sun.instrument.InstrumentationImpl#transform
transformedBufferObject = (*jnienv)->CallObjectMethod(
   jnienv,
   agent->mInstrumentationImpl, //sun.instrument.InstrumentationImpl
   agent->mTransform, //transform
   moduleObject,
   loaderObject,
   classNameStringObject,
   classBeingRedefined,
   protectionDomain,
   classFileBufferObject,
   is_retransformer);

In step 3:

The VM_RedefineClasses::redefine_single_class(jclass the_jclass, InstanceKlass* scratch_class, TRAPS) method replaces parts of the target class (such as the constant pool, methods, etc.) with parts from the transformed class.

// src/hotspot/share/prims/jvmtiRedefineClasses.cpp
for (int i = 0; i < _class_count; i++) {
  redefine_single_class(_class_defs[i].klass, _scratch_classes[i], thread);
}

So how to speed up runtime Java code instrumentation?

In my project, the total time and the max-min time are almost the same if the app is in a paused state while transforming. Can you provide some demo code?

It's impossible to change the way the JVM works, so multithreading may not be a bad idea. It got several times faster after using multithreading in my demo project.

Answer 2

From your description it seems like the complete transformation is running in a single thread.

You could create multiple threads, each transforming one class at a time, since the transformation of one class should be independent of any other class. This should improve the overall transformation time by up to a factor of the number of cores available on the executing system.

You can count the cores with:

int cores = Runtime.getRuntime().availableProcessors();

Chunk the list of classes to be transformed into as many parts as there are cores, and create that many threads to process the chunks in parallel.
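The chunking idea could look roughly like the sketch below. The class ParallelRetransform and its method names are hypothetical. It assumes the registered transformer is thread-safe (the unsynchronized total += ... counter in the question's MyTransfomer would not be), and how much it actually helps depends on how much of the work the JVM can overlap, so it is worth measuring rather than assuming a linear speedup.

```java
import java.lang.instrument.Instrumentation;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelRetransform {
    // Splits items into at most `parts` contiguous chunks of near-equal size.
    static <T> List<List<T>> partition(List<T> items, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int n = items.size();
        if (n == 0 || parts <= 0) {
            return chunks;
        }
        int size = (n + parts - 1) / parts; // ceiling of n / parts
        for (int from = 0; from < n; from += size) {
            chunks.add(new ArrayList<>(items.subList(from, Math.min(n, from + size))));
        }
        return chunks;
    }

    // Issues one retransformClasses call per chunk, each on its own pool thread.
    static void retransformInParallel(Instrumentation inst, List<Class<?>> classes)
            throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<Class<?>> chunk : partition(classes, cores)) {
                Class<?>[] batch = chunk.toArray(new Class<?>[0]);
                pending.add(pool.submit(() -> {
                    inst.retransformClasses(batch);
                    return null; // Callable, so checked exceptions propagate via Future.get()
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows any failure wrapped in ExecutionException
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

From the agent, this would be called as ParallelRetransform.retransformInParallel(instrumentation, Arrays.asList(retransformClassArray)) in place of the single retransformClasses call.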
