
Thread Pool per key in Java

Suppose that you have a grid G of n x m cells, where n and m are huge. Further, suppose that we have numerous tasks, where each task belongs to a single cell in G and should be executed in parallel (in a thread pool or other resource pool).

However, tasks belonging to the same cell must be executed serially; that is, each task should wait until the previous task in the same cell has finished.

How can I solve this issue? I've searched and tried several thread pools (Executors, Thread), but with no luck.

Minimum Working Example

import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MWE {

    public static void main(String[] args) {
        ExecutorService threadPool = Executors.newFixedThreadPool(16);
        Random r = new Random();

        for (int i = 0; i < 10000; i++) {
            int nx = r.nextInt(10);
            int ny = r.nextInt(10);

            Runnable task = new Runnable() { 
                public void run() { 
                  try {
                    System.out.println("Task is running"); 
                    Thread.sleep(1000);
                  } catch (InterruptedException e) {
                    e.printStackTrace();
                  }
                } 
            };

            threadPool.submit(new Thread(task)); // Should use nx,ny here somehow
        }
    }

}

If I understand you correctly, you want to execute X tasks (X is very big) in Y queues (Y is much smaller than X).
Java 8 has the CompletableFuture class, which represents an (asynchronous) computation. Basically, it's Java's implementation of a Promise. Here is how you can organize a chain of computations (generic types omitted):

// start the queue with a "completed" task
CompletableFuture queue = CompletableFuture.completedFuture(null);  
// append a first task to the queue
queue = queue.thenRunAsync(() -> System.out.println("first task running"));  
// append a second task to the queue
queue = queue.thenRunAsync(() -> System.out.println("second task running"));
// ... and so on

When you use thenRunAsync(Runnable), tasks will be executed using a thread pool, by default ForkJoinPool.commonPool() (there are other possibilities; see the API docs). You can also supply your own thread pool as a second argument. You can create Y such chains (possibly keeping references to them in some table).
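The "table of chains" idea can be sketched roughly as follows. This is only an illustration under assumptions: the class name KeyedChains, the use of a long as the key, and the decision to let a chain continue after a failed task are all choices made for this sketch, not something the answer specifies.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: one CompletableFuture chain per key, all sharing one pool.
public class KeyedChains {

    // Maps each key to the tail (most recently appended task) of its chain.
    private final ConcurrentHashMap<Long, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final Executor pool;

    public KeyedChains(Executor pool) {
        this.pool = pool;
    }

    /** Appends the task to the chain of its key and returns the future of that task. */
    public CompletableFuture<Void> submit(long key, Runnable task) {
        // compute() is atomic per key, so concurrent submitters cannot lose a task.
        return tails.compute(key, (k, tail) -> {
            CompletableFuture<Void> previous =
                    (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            // Assumption: a failed task should not block later tasks of the same key.
            return previous.exceptionally(e -> null).thenRunAsync(task, pool);
        });
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        KeyedChains chains = new KeyedChains(pool);
        List<Integer> seen = new CopyOnWriteArrayList<>();

        CompletableFuture<Void> last = null;
        for (int i = 0; i < 5; i++) {
            final int n = i;
            last = chains.submit(42L, () -> seen.add(n)); // same key: strictly one after another
        }
        last.join(); // the last task of the chain finishes after all earlier ones
        System.out.println(seen); // prints [0, 1, 2, 3, 4]
        pool.shutdown();
    }
}
```

Note that this sketch never removes an entry from the map, so with a huge number of distinct keys the table itself would need some cleanup strategy.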

This is where systems like Akka make sense in the Java world. If both X and Y are large, you may want to process them with a message-passing mechanism rather than bunching them up in a huge chain of callbacks and futures. One actor holds the list of tasks to be done and is handed a cell; the actor eventually computes the result and persists it. If something fails in an intermediate step, it's not the end of the world.
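To make the message-passing idea a little more concrete without pulling in Akka, here is a rough plain-Java stand-in for an actor's mailbox. It is not Akka's API: CellMailbox is a hypothetical name, and the design follows the serial-executor pattern (a queue drained one message at a time on a shared pool). Treat it as a sketch of the concept only.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for an actor mailbox: runs its messages one at a time on a shared pool. */
public class CellMailbox implements Executor {

    private final Queue<Runnable> messages = new ArrayDeque<>();
    private final Executor pool;
    private boolean busy = false; // true while a message of this mailbox is scheduled or running

    public CellMailbox(Executor pool) {
        this.pool = pool;
    }

    @Override
    public synchronized void execute(Runnable message) {
        messages.add(message);
        if (!busy) {
            busy = true;
            scheduleNext();
        }
    }

    // Must be called while holding this mailbox's monitor.
    private void scheduleNext() {
        Runnable next = messages.poll();
        if (next == null) {
            busy = false; // mailbox drained; the next execute() restarts it
            return;
        }
        pool.execute(() -> {
            try {
                next.run();
            } finally {
                synchronized (this) {
                    scheduleNext(); // hand the following message to the pool
                }
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Map<Long, CellMailbox> cells = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(6);
        for (int i = 0; i < 6; i++) {
            long cell = i % 2; // two cells, three messages each
            int n = i;
            cells.computeIfAbsent(cell, c -> new CellMailbox(pool)).execute(() -> {
                System.out.println("cell " + cell + " handles message " + n);
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
    }
}
```

Messages for one cell are handled in submission order, while different cells can be served by different pool threads at the same time. A real actor system adds supervision, persistence and distribution on top of this.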

A callback mechanism with a synchronized block could work efficiently here. I have previously answered a similar question (the answer is linked in the code comment below). There are some limitations (see the linked answer), but it is simple enough to keep track of what is going on (good maintainability). I have adapted the source code and made it more efficient for your case, where most tasks will be executed in parallel (since n and m are huge) but on occasion must be serial (when tasks are for the same point in the grid G).

import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.locks.ReentrantLock;

// Adapted from https://stackoverflow.com/a/33113200/3080094
public class GridTaskExecutor {

    public static void main(String[] args) {

        final int maxTasks = 10_000;
        final CountDownLatch tasksDone = new CountDownLatch(maxTasks);
        ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(16);
        try {
            GridTaskExecutor gte = new GridTaskExecutor(executor); 
            Random r = new Random();

            for (int i = 0; i < maxTasks; i++) {

                final int nx = r.nextInt(10);
                final int ny = r.nextInt(10);

                Runnable task = new Runnable() { 
                    public void run() { 
                        try {
                            // System.out.println("Task " + nx + " / " + ny + " is running");
                            Thread.sleep(1);
                        } catch (Exception e) {
                            e.printStackTrace();
                        } finally {
                            tasksDone.countDown();
                        }
                    } 
                };
                gte.addTask(task, nx, ny);
            }
            tasksDone.await();
            System.out.println("All tasks done, task points remaining: " + gte.size());
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            executor.shutdownNow();
        }
    }

    private final Executor executor;
    private final Map<Long, List<CallbackPointTask>> tasksWaiting = new HashMap<>();
    // make lock fair so that adding and removing tasks is balanced.
    private final ReentrantLock lock = new ReentrantLock(true);

    public GridTaskExecutor(Executor executor) {
        this.executor = executor;
    }

    public void addTask(Runnable r, int x, int y) {

        Long point = toPoint(x, y);
        CallbackPointTask pr = new CallbackPointTask(point, r);
        boolean runNow = false;
        lock.lock();
        try {
            List<CallbackPointTask> pointTasks = tasksWaiting.get(point);
            if (pointTasks == null) {
                if (tasksWaiting.containsKey(point)) {
                    pointTasks = new LinkedList<CallbackPointTask>();
                    pointTasks.add(pr);
                    tasksWaiting.put(point, pointTasks);
                } else {
                    tasksWaiting.put(point, null);
                    runNow = true;
                }
            } else {
                pointTasks.add(pr);
            }
        } finally {
            lock.unlock();
        }
        if (runNow) {
            executor.execute(pr);
        }
    }

    private void taskCompleted(Long point) {

        lock.lock();
        try {
            List<CallbackPointTask> pointTasks = tasksWaiting.get(point);
            if (pointTasks == null || pointTasks.isEmpty()) {
                tasksWaiting.remove(point);
            } else {
                System.out.println(Arrays.toString(fromPoint(point)) + " executing task " + pointTasks.size());
                executor.execute(pointTasks.remove(0));
            }
        } finally {
            lock.unlock();
        }
    }

    // for a general callback-task, see https://stackoverflow.com/a/826283/3080094
    private class CallbackPointTask implements Runnable {

        final Long point;
        final Runnable original;

        CallbackPointTask(Long point, Runnable original) {
            this.point = point;
            this.original = original;
        }

        @Override
        public void run() {

            try {
                original.run();
            } finally {
                taskCompleted(point);
            }
        }
    }

    /** Amount of points with tasks. */ 
    public int size() {

        int l = 0;
        lock.lock();
        try {
            l = tasksWaiting.size(); 
        } finally {
            lock.unlock();
        }
        return l;
    }

    // https://stackoverflow.com/a/12772968/3080094
    public static long toPoint(int x, int y) {
        return (((long)x) << 32) | (y & 0xffffffffL);
    }

    public static int[] fromPoint(long p) {
        return new int[] {(int)(p >> 32), (int)p };
    }

}

You can create a list of n Executors.newFixedThreadPool(1) instances and then submit each task to the corresponding thread by using a hash function, e.g. threadPool[key % n].submit(new Thread(task)).
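A minimal sketch of this approach might look like the following. The class name StripedExecutor and the hash 31 * x + y are assumptions made for illustration. Math.floorMod is used instead of % so that a negative hash still yields a valid index.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: n single-threaded pools, tasks routed by a hash of the cell.
public class StripedExecutor {

    private final ExecutorService[] stripes;

    public StripedExecutor(int n) {
        stripes = new ExecutorService[n];
        for (int i = 0; i < n; i++) {
            stripes[i] = Executors.newFixedThreadPool(1); // one thread, so tasks run in FIFO order
        }
    }

    /** Routes the task to the stripe of its cell; tasks of the same cell share a stripe. */
    public Future<?> submit(int x, int y, Runnable task) {
        int hash = 31 * x + y;                             // assumed hash of the cell
        int index = Math.floorMod(hash, stripes.length);   // never negative, unlike %
        return stripes[index].submit(task);
    }

    public void shutdown() {
        for (ExecutorService stripe : stripes) {
            stripe.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        StripedExecutor executor = new StripedExecutor(16);
        Future<?> last = null;
        for (int i = 0; i < 3; i++) {
            final int n = i;
            last = executor.submit(2, 5, () -> System.out.println("cell (2,5), task " + n));
        }
        last.get(); // the stripe is FIFO, so the earlier tasks have finished as well
        executor.shutdown();
    }
}
```

The trade-off is that different cells hashing to the same stripe are also serialized against each other, and at most n tasks run at once.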

This library should do the job: https://github.com/jano7/executor

int maxTasks = 16;
ExecutorService threadPool = Executors.newFixedThreadPool(maxTasks);
KeySequentialBoundedExecutor executor = new KeySequentialBoundedExecutor(maxTasks, threadPool);

Random r = new Random();

for (int i = 0; i < 10000; i++) {
    int nx = r.nextInt(10);
    int ny = r.nextInt(10);

    Runnable task = new Runnable() {
        public void run() {
            try {
                System.out.println("Task is running");
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    };

    executor.execute(new KeyRunnable<>((ny * 10) + nx, task));
}

The Scala example below demonstrates how the keys of a map can be processed in parallel while the values of each key are processed serially. Translate it to Java syntax if you want to try it in Java (Scala uses the JVM libraries). Basically, chain the tasks' futures so that they execute sequentially.

import java.util.concurrent.{CompletableFuture, ExecutorService, Executors, Future, TimeUnit}
import scala.collection.concurrent.TrieMap
import scala.collection.mutable.ListBuffer
import scala.util.Random

/**
 * For a given Key-Value pair with tasks as values, demonstrates sequential execution of tasks
 * within a key and parallel execution across keys.
 */
object AsyncThreads {

  val cachedPool: ExecutorService = Executors.newCachedThreadPool
  var initialData: Map[String, ListBuffer[Int]] = Map()
  var processedData: TrieMap[String, ListBuffer[Int]] = TrieMap()
  var runningTasks: TrieMap[String, CompletableFuture[Void]] = TrieMap()

  /**
   * synchronous execution across keys and values
   */
  def processSync(key: String, value: Int, initialSleep: Long) = {
    Thread.sleep(initialSleep)
    if (key.equals("key_0")) {
      println(s"${Thread.currentThread().getName} -> sleep: $initialSleep. Inserting key_0 -> $value")
    }
    processedData.getOrElseUpdate(key, new ListBuffer[Int]).addOne(value)
  }

  /**
   * parallel execution across keys
   */
  def processASync(key: String, value: Int, initialSleep: Long) = {
    val task: Runnable = () => {
      processSync(key, value, initialSleep)
    }

    // 1. Chain the futures for sequential execution within a key
    val prevFuture = runningTasks.getOrElseUpdate(key, CompletableFuture.completedFuture(null))
    runningTasks.put(key, prevFuture.thenRunAsync(task, cachedPool))

    // 2. Parallel execution across keys and values
    // cachedPool.submit(task)
  }

  def process(key: String, value: Int, initialSleep: Int): Unit = {
    //processSync(key, value, initialSleep) // synchronous execution across keys and values
    processASync(key, value, initialSleep) // parallel execution across keys
  }

  def main(args: Array[String]): Unit = {

    checkDiff()

    0.to(9).map(kIndex => {
      var key = "key_" + kIndex
      var values = ListBuffer[Int]()
      initialData += (key -> values)
      1.to(10).map(vIndex => {
        values += kIndex * 10 + vIndex
      })
    })

    println(s"before data:$initialData")

    initialData.foreach(entry => {
      entry._2.foreach(value => {
        process(entry._1, value, Random.between(0, 100))
      })
    })

    cachedPool.awaitTermination(5, TimeUnit.SECONDS)
    println(s"after data:$processedData")

    println("diff: " + (initialData.toSet diff processedData.toSet).toMap)
    cachedPool.shutdown()
  }

  def checkDiff(): Unit = {
    var a1: TrieMap[String, List[Int]] = new TrieMap()
    a1.put("one", List(1, 2, 3, 4, 5))
    a1.put("two", List(11, 12, 13, 14, 15))

    var a2: TrieMap[String, List[Int]] = new TrieMap()
    a2.put("one", List(2, 1, 3, 4, 5))
    a2.put("two", List(11, 12, 13, 14, 15))


    println("a1: " + a1)
    println("a2: " + a2)

    println("check.diff: " + (a1.toSet diff a2.toSet).toMap)
  }
}
