Run time of jobs submitted to an ExecutorService

Question

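Good day,

I am writing a program in which a method is called for each line read from a text file. Since each call of this method is independent of every other line read, I can run the calls in parallel. To make the most of the CPU, I use an ExecutorService to which I submit each run() call. Since the text file has 15 million lines, I need to stagger the submissions so that not too many jobs are created at once (OutOfMemory exception). I also want to track how long each submitted run has been running, because I have seen that some of them never finish. The problem is that when I try to use the Future.get method with a timeout, the timeout refers to the time since the job entered the ExecutorService queue, not the time since it started running, if it has started at all. I want the time since it started running, not since it entered the queue.

The code is as follows: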

ExecutorService executorService= Executors.newFixedThreadPool(ncpu);
line = reader.readLine();
long start = System.currentTimeMillis();
HashMap<MyFut,String> runs = new HashMap<MyFut, String>();
HashMap<Future, MyFut> tasks = new HashMap<Future, MyFut>();
while ( (line = reader.readLine()) != null ) { 

String s = line.split("\t")[1];
final String m = line.split("\t")[0];
MyFut f = new MyFut(s, m);
tasks.put(executorService.submit(f), f);

runs.put(f, line);

while (tasks.size()>ncpu*100){
    try {
        Thread.sleep(100);
    } catch (InterruptedException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    Iterator<Future> i = tasks.keySet().iterator();
    while(i.hasNext()){
        Future task = i.next();
        if (task.isDone()){
            i.remove();

        } else {
            MyFut fut = tasks.get(task);
            if (fut.elapsed()>10000){
                System.out.println(line);
                task.cancel(true);
                i.remove();
            }
        }
    }
}
}

private static class MyFut implements Runnable{

private long start;
String copy;
String id2;

public MyFut(String m, String id){
    super();

    copy=m;
    id2 = id;
}

public long elapsed(){
    return System.currentTimeMillis()-start;
}



@Override
public void run() {
    start = System.currentTimeMillis();
    // do something...
}

}

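As you can see, I try to keep track of how many jobs have been submitted, and if a threshold is exceeded I wait a bit until some of them have finished. I also try to check whether any job is taking too long so that I can cancel it, remember which one failed, and continue. This does not work as I hoped. Ten seconds of execution for one task is far more than should be needed (I get 1000 lines done in 70 to 130 seconds, depending on the machine and the number of CPUs).

What am I doing wrong? Shouldn't the run method of my Runnable class be called only when some thread in the ExecutorService is free and starts working on it? I get a lot of results reported as taking more than 10 seconds. Is there a better way to achieve what I am trying to do?

Thanks.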

Answer 1

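If you are using Future, I would suggest changing the Runnable to a Callable and returning the total execution time of the task as its result. Below is sample code: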

import java.util.concurrent.Callable;

public class MyFut implements Callable<Long> {

    String copy;
    String id2;

    public MyFut(String m, String id) {
        super();

        copy = m;
        id2 = id;
    }

    @Override
    public Long call() throws Exception {
        long start = System.currentTimeMillis();
        //do something...
        long end = System.currentTimeMillis();
        return (end - start);
    }
}
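As a usage illustration, the sketch below submits such a Callable to a pool and reads the reported run time back through Future.get(). The names TimedCallableDemo, TimedTask and runAll are hypothetical, and Thread.sleep stands in for the real per-line work. Because the clock is read inside call(), time spent waiting in the queue is not included in the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TimedCallableDemo {

    /** Hypothetical stand-in for MyFut: returns its own run time in milliseconds. */
    static final class TimedTask implements Callable<Long> {
        private final long workMillis;

        TimedTask(long workMillis) {
            this.workMillis = workMillis;
        }

        @Override
        public Long call() throws Exception {
            long start = System.nanoTime();
            Thread.sleep(workMillis); // placeholder for the real per-line work
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    /** Runs one task per entry of workMillis and returns the run times they report. */
    static List<Long> runAll(int threads, long... workMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (long w : workMillis) {
                futures.add(pool.submit(new TimedTask(w)));
            }
            List<Long> times = new ArrayList<>();
            for (Future<Long> f : futures) {
                times.add(f.get()); // blocks until that task has finished
            }
            return times;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One worker thread: the later tasks wait in the queue, but each
        // reports only the time spent inside call().
        System.out.println(runAll(1, 50, 50, 50));
    }
}
```

Note that the run time only becomes available after the task has completed, so this approach reports durations but cannot by itself detect a task that never finishes.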

Answer 2

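You are making this harder than it needs to be. Java's framework provides everything you want; you only have to use it.

Limiting the number of pending work items works by using a bounded queue, but the ExecutorService returned by Executors.newFixedThreadPool() uses an unbounded queue. The policy of waiting once the bounded queue is full can be implemented via a RejectedExecutionHandler. The entire thing looks like this: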

static class WaitingRejectionHandler implements RejectedExecutionHandler {
  public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
    try {
      executor.getQueue().put(r);// block until capacity available
    } catch(InterruptedException ex) {
      throw new RejectedExecutionException(ex);
    }
  }
}
public static void main(String[] args)
{
  final int nCPU=Runtime.getRuntime().availableProcessors();
  final int maxPendingJobs=100;
  ExecutorService executorService=new ThreadPoolExecutor(nCPU, nCPU, 1, TimeUnit.MINUTES,
    new ArrayBlockingQueue<Runnable>(maxPendingJobs), new WaitingRejectionHandler());

  // start flooding the `executorService` with jobs here
}

That's it.

Measuring the elapsed time within a job is quite easy, as it has nothing to do with multi-threading:
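For reference, here is a self-contained version of the same idea that can be compiled and run. It is only a sketch: the class name BoundedSubmitDemo, the flood method and the counting job are hypothetical, and the handler additionally restores the interrupt flag, which the snippet above does not do. The submitting thread blocks inside execute() whenever the queue is full, so the number of pending jobs never exceeds maxPendingJobs.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedSubmitDemo {

    /** Blocks the submitting thread until the bounded queue has room again. */
    static final class WaitingRejectionHandler implements RejectedExecutionHandler {
        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            try {
                executor.getQueue().put(r);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
                throw new RejectedExecutionException(ex);
            }
        }
    }

    /** Submits trivial jobs; returns {jobs completed, largest queue size observed}. */
    static int[] flood(int threads, int maxPendingJobs, int jobs) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(threads, threads, 1, TimeUnit.MINUTES,
                new ArrayBlockingQueue<Runnable>(maxPendingJobs), new WaitingRejectionHandler());
        AtomicInteger completed = new AtomicInteger();
        int maxQueued = 0;
        try {
            for (int i = 0; i < jobs; i++) {
                pool.execute(completed::incrementAndGet); // blocks while the queue is full
                maxQueued = Math.max(maxQueued, pool.getQueue().size());
            }
            pool.shutdown(); // only after the last job was handed over
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {completed.get(), maxQueued};
    }

    public static void main(String[] args) {
        int nCPU = Runtime.getRuntime().availableProcessors();
        int[] result = flood(nCPU, 100, 100_000);
        System.out.println("completed=" + result[0] + ", max queued=" + result[1]);
    }
}
```

One caveat with this pattern: the handler puts the job straight into the queue, so it should not be used to submit work after the executor has been shut down, since such a job might never run.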

long startTime=System.nanoTime();
// do your work here
long elapsedTimeSoFar = System.nanoTime()-startTime;

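But once you are using a bounded queue, you may not need it anymore.

By the way, the Future.get method with a timeout does not refer to the time since the job entered the ExecutorService queue; it refers to the time since the get method itself was invoked. In other words, it tells the get method how long it is allowed to wait, nothing more.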
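If the submitting thread still needs to ask a job how long it has been running, one detail of the question's MyFut matters: its start field is 0 until run() begins, so elapsed() on a job that is still queued returns roughly the current epoch time in milliseconds, which is always greater than 10000. The following is a minimal sketch of one way around that, not the answer's own code: a hypothetical StartAwareTask wrapper that reports -1 until the job has actually started.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StartAwareTask implements Runnable {
    private final Runnable work;
    private volatile long startNanos;  // only meaningful once started is true
    private volatile boolean started;  // volatile so the monitoring thread sees the update

    StartAwareTask(Runnable work) {
        this.work = work;
    }

    @Override
    public void run() {
        startNanos = System.nanoTime();
        started = true;
        work.run();
    }

    /** Milliseconds since run() began, or -1 if the job has not started yet. */
    long elapsedMillis() {
        if (!started) {
            return -1;
        }
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis); // placeholder for real work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        StartAwareTask first = new StartAwareTask(() -> sleepQuietly(200));
        StartAwareTask second = new StartAwareTask(() -> sleepQuietly(200));
        pool.execute(first);
        pool.execute(second); // waits in the queue behind the first job
        Thread.sleep(100);
        System.out.println("first: " + first.elapsedMillis() + " ms, second: " + second.elapsedMillis());
        pool.shutdown();
    }
}
```

A monitoring loop like the one in the question would then cancel a job only when elapsedMillis() is non-negative and above the limit.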
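The point about the timeout can be checked with a small experiment. In this sketch (class and method names are hypothetical, and the durations are arbitrary values chosen for illustration), a job sits in the queue behind a 500 ms blocker. The caller waits 400 ms and only then calls get with a 400 ms timeout. If the timeout were counted from the moment the job was queued, it would expire before the blocker finishes; counted from the call to get, it leaves enough time.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class GetTimeoutDemo {

    /** Returns true if the queued job's result arrives within the get() timeout. */
    static boolean queuedJobFinishesWithinGetTimeout() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Occupies the only worker thread for about 500 ms.
            pool.submit(() -> {
                Thread.sleep(500);
                return null;
            });
            // Queued immediately, but cannot start before the blocker is done.
            Future<String> queued = pool.submit(() -> "done");

            Thread.sleep(400); // the job has now been queued for about 400 ms

            try {
                // The 400 ms budget starts here, at the call to get().
                return "done".equals(queued.get(400, TimeUnit.MILLISECONDS));
            } catch (TimeoutException e) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("result within timeout: " + queuedJobFinishesWithinGetTimeout());
    }
}
```

Since the experiment depends on wall-clock timing, a heavily loaded machine could in principle distort it; the margins are kept wide for that reason.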

Answer 1
2 2013-12-05 10:24:14

Answer 2
1 accepted 2013-12-05 10:57:01
