
Java multithreading not utilizing all cores

I have a multi-threaded program that does not seem to utilize all the cores on my machine. Here is the code; any input would be highly appreciated.

Main class

public class MainClass {
    public static void main(String[] args) throws Exception {
        Work work = new Work();
        // doIt() declares checked exceptions, so main has to propagate them
        work.doIt();
    }
}

The second class creates the tasks and hands them to the ExecutorService. Here is the pseudo-code:

public class Work {
    public void doIt() throws InterruptedException, Exception {
        map = get some data and put it in the map;
        ArrayList<Future<Integer>> list = new ArrayList<Future<Integer>>();
        ArrayList<WorkCallable> jobs = new ArrayList<WorkCallable>();
        for each entry in the map
            jobs.add(new WorkCallable(entry));
        // one pool thread per available core
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(numCores);
        int size = jobs.size();
        for (int i = 0; i < size; i++) {
            Callable<Integer> worker = jobs.get(i);
            Future<Integer> submit = executor.submit(worker);
            list.add(submit);
        }
        executor.shutdown();
        // busy-waits until every submitted task has finished
        while (!executor.isTerminated()) {}
        do something with the returned data;
    }
}

The Callable class

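public class WorkCallable implements Callable<Integer> {
    // Map of document ID to document content, as described in the update below
    private final Map<String, String> entry;

    public WorkCallable(Map<String, String> entry) {
        this.entry = entry;
    }

    @Override
    public Integer call() throws Exception {
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos");
        // a new pipeline is built inside every task
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        for (String id : entry.keySet()) {
            Annotation document = new Annotation(entry.get(id));
            pipeline.annotate(document);

            process the data;
        }
        return an integer value;
    }
}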

The problem is that when I check how many threads are running, I find only a few, and the executor does not seem to take advantage of the idle cores!

I hope the description is clear.

Update:

  • The library used is the StanfordCoreNLP package, which processes the text passed to the Callable object as a Map of documentID to content. Processing the data is not the issue: it works fine without the StanfordCoreNLP library. In other words, shallow processing of the documents works fine and utilizes all cores, but when I include this package it does not.

If you are using Windows, the JVM delegates thread scheduling to the NT kernel. POSIX-type operating systems map JVM threads directly onto OS threads, which the OS kernel then schedules.

However, whatever happens, you cannot ensure that the threads are assigned evenly across the cores/processors. Something else on the OS could be running on core 4 when you start your 4th thread, so that thread might get scheduled on another core. Or the scheduler could decide to stack them on the same core.
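Core placement is not visible from plain Java, but you can at least check that the pool itself hands work to all of its threads. The following is a minimal sketch, not the asker's code: the class name PoolUsageProbe, the method workerThreadsUsed, and the arithmetic loop standing in for real work are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolUsageProbe {

    // Runs CPU-bound dummy tasks and returns the names of the pool threads
    // that executed at least one of them.
    static Set<String> workerThreadsUsed(int poolSize, int taskCount)
            throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < taskCount; i++) {
            executor.execute(() -> {
                names.add(Thread.currentThread().getName());
                // Hypothetical stand-in for real work: pure arithmetic, no locks.
                long x = 0;
                for (int k = 0; k < 5_000_000; k++) {
                    x += k % 7;
                }
                if (x < 0) {
                    throw new IllegalStateException("unreachable");
                }
            });
        }
        executor.shutdown();
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        Set<String> used = workerThreadsUsed(cores, cores * 4);
        System.out.println("available processors: " + cores);
        System.out.println("pool threads that ran tasks: " + used.size());
    }
}
```

If this probe keeps every pool thread busy but the real workload does not, the limit is more likely inside the tasks than in the scheduler.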

At this point, with the information you have provided, I suspect there is some contention among the threads, so some of them are probably blocked or waiting. To verify this, you can use JVisualVM to take a thread dump (JConsole is also an option). JVisualVM is a utility for monitoring Java applications and comes with the JDK. If you have not used it before, learning it would be a good investment of your time, as it is very useful and simple to use.

See here for JVisualVM.

  1. Connect to your program using JVisualVM and take a thread dump.
  2. The dump shows the state of every thread in your program at that instant; if there is contention and/or blocking, it is easy to spot there.
  3. Feel free to paste it here if you cannot make out what is happening in the thread dump, though there are a number of resources on the web that explain how to read one.
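The same kind of information can also be pulled from inside the JVM with the standard java.lang.management API. This is a rough sketch rather than a replacement for JVisualVM; the class name ThreadStateSnapshot and the method countByState are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSnapshot {

    // Counts live threads per state (RUNNABLE, BLOCKED, WAITING, ...).
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        // For each thread that is not runnable, show what it is waiting on.
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() != Thread.State.RUNNABLE) {
                System.out.println(info.getThreadName()
                        + " " + info.getThreadState()
                        + " lock=" + info.getLockName()
                        + " owner=" + info.getLockOwnerName());
            }
        }
    }
}
```

Calling countByState() periodically from a separate thread while the workers run would show whether most pool threads sit in BLOCKED or WAITING rather than RUNNABLE.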

On another note, as @Marko pointed out, you could handle the executor shutdown more efficiently, and I would say ExecutorCompletionService fits your requirement and would make the code more elegant and easier to read. Check here for ExecutorCompletionService. Once you have figured out the idle cores, maybe you can refactor to use ECS.
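As a sketch of what that refactoring might look like, the busy-wait loop from the question can be replaced by taking results from an ExecutorCompletionService as they complete. The class name CompletionServiceSketch, the method sumResults, and the squaring jobs are hypothetical stand-ins for the real WorkCallable tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionServiceSketch {

    // Submits every job and adds up the results in completion order.
    static int sumResults(List<Callable<Integer>> jobs)
            throws InterruptedException, ExecutionException {
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(numCores);
        CompletionService<Integer> completion =
                new ExecutorCompletionService<>(executor);
        try {
            for (Callable<Integer> job : jobs) {
                completion.submit(job);
            }
            int total = 0;
            for (int i = 0; i < jobs.size(); i++) {
                // take() blocks until the next task has finished, so the
                // calling thread does not spin while it waits.
                total += completion.take().get();
            }
            return total;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical jobs: each one returns the square of its index.
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            final int n = i;
            jobs.add(() -> n * n);
        }
        System.out.println(sumResults(jobs)); // prints 385
    }
}
```

Because take() blocks instead of polling isTerminated(), the main thread no longer occupies a core while the workers run.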
