
Worker pool where certain tasks can only be done by certain workers

I have a lot of tasks that I'd like to execute a few at a time. The normal solution for this is a thread pool. However, my tasks need resources that only certain threads have. So I can't just farm a task out to any old thread; the thread has to have the resource the task needs.

It seems like there should be a concurrency pattern for this, but I can't seem to find it. I'm implementing this in Python 2 with multiprocessing, so answers in those terms would be great, but a generic solution is fine. In my case the "threads" are actually separate OS processes and the resources are network connections (and no, it's not a server, so (e)poll/select is not going to help). In general, a thread/process can hold several resources.

Here is a naive solution: put the tasks in a work queue and turn my thread pool loose on it. Have each thread check, "Can I do this task?" If yes, do it; if no, put it back in the queue. However, if each task can only be done by one of N threads, then I'm doing ~2N expensive, wasted accesses to a shared queue (a get and a put for each thread that can't do the task) just to get one unit of work.

Here is my current thought: have a shared work queue for each resource. Farm out tasks to the matching queue. Each thread checks the queue(s) it can handle.
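A minimal sketch of that queue-per-resource idea, for illustration only. It uses `threading` and `queue` from Python 3 to stay short; `multiprocessing.Queue` offers the same `put`/`get` calls, so the shape should carry over to separate processes. The names `worker`, `run_demo`, `conn_a` and so on are made up, and the short back-off when every queue is empty is an assumption, not something from the question.

```python
import queue
import threading

def worker(name, my_queues, results, stop):
    """Serve only the queues for the resources this worker holds."""
    while not stop.is_set():
        idle = True
        for resource, q in my_queues.items():
            try:
                task = q.get_nowait()
            except queue.Empty:
                continue
            idle = False
            results.put((name, resource, task))  # stand-in for the real work
            q.task_done()
        if idle:
            stop.wait(0.01)  # short back-off so an idle worker does not spin

def run_demo():
    # Hypothetical mapping of workers to the resources they hold.
    ownership = {"w1": ["conn_a", "conn_b"], "w2": ["conn_c"]}
    queues = {r: queue.Queue() for owned in ownership.values() for r in owned}
    results = queue.Queue()
    stop = threading.Event()
    threads = [
        threading.Thread(target=worker,
                         args=(name, {r: queues[r] for r in owned}, results, stop))
        for name, owned in ownership.items()
    ]
    for t in threads:
        t.start()
    # Dispatch: each task goes straight to the queue for the resource it needs.
    for resource, payload in [("conn_a", 1), ("conn_c", 2), ("conn_b", 3), ("conn_a", 4)]:
        queues[resource].put(payload)
    for q in queues.values():
        q.join()  # wait until every queued task has been marked done
    stop.set()
    for t in threads:
        t.join()
    return [results.get() for _ in range(results.qsize())]
```

No task is ever pulled by a worker that can't run it, so the requeueing cost goes away. What remains is the polling when a worker's queues are all empty.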

Ideas?

A common approach to this is to not allocate resources to threads and instead queue the appropriate resource in with the data, though I appreciate that this is not always possible if a resource is bound to a particular thread.
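As a rough illustration of queueing the resource with the data, here is a sketch in Python 3 where any pool thread can run any task because the task carries its resource. `Connection`, `run_task` and `run_demo` are hypothetical names, and the per-connection lock is an assumption added so that two tasks never use the same connection at once. It only applies when the resource can be handed between threads, which may not hold for connections owned by separate processes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Connection:
    """Hypothetical stand-in for a resource such as a network connection."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # one task at a time per connection

    def send(self, payload):
        with self.lock:
            return "%s handled %s" % (self.name, payload)

def run_task(item):
    conn, payload = item  # the resource arrives together with the data
    return conn.send(payload)

def run_demo():
    conns = {"a": Connection("a"), "b": Connection("b")}
    work = [(conns["a"], 1), (conns["b"], 2), (conns["a"], 3)]
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(run_task, work))  # results come back in input order
```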

The idea of using a queue per resource, with each thread only popping objects from the queues whose objects it can handle, may work.

It may be possible to use a semaphore+concurrentQueue array, indexed by resource, for signaling such threads and also providing a priority system, so eliminating most of the polling and wasteful requeueing. I will have to think a bit more about that - it kinda depends on how the resources map to the threads.
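One way the semaphore plus queue array could look, as a sketch under a simplifying assumption: each resource is held by exactly one worker. Each worker sleeps on its own semaphore, which is released once per queued task, and on waking scans its queues in priority order. The `Worker` class, its `submit`/`shutdown` methods and the resource names are all made up for illustration. Python 3 `threading` is used here; a multiprocessing version would need `multiprocessing.Semaphore` and `multiprocessing.Queue` instead.

```python
import queue
import threading

class Worker(threading.Thread):
    def __init__(self, name, resources_by_priority):
        super().__init__(name=name)
        self.order = list(resources_by_priority)  # highest priority first
        self.queues = {r: queue.Queue() for r in self.order}
        self.pending = threading.Semaphore(0)  # one release per queued task
        self.stopping = threading.Event()
        self.done = []

    def submit(self, resource, task):
        self.queues[resource].put(task)
        self.pending.release()  # wake the worker; no polling needed

    def shutdown(self):
        self.stopping.set()
        self.pending.release()  # extra wake-up so the worker can see the flag

    def run(self):
        while True:
            self.pending.acquire()  # block until a task or shutdown is signalled
            for resource in self.order:
                try:
                    task = self.queues[resource].get_nowait()
                except queue.Empty:
                    continue
                self.done.append((resource, task))  # stand-in for the real work
                break
            else:
                if self.stopping.is_set():
                    return  # every queue is empty and shutdown was requested

def run_demo():
    w1 = Worker("w1", ["conn_a", "conn_b"])
    w2 = Worker("w2", ["conn_c"])
    owner = {"conn_a": w1, "conn_b": w1, "conn_c": w2}
    tasks = [("conn_b", "b1"), ("conn_c", "c1"), ("conn_a", "a1"),
             ("conn_b", "b2"), ("conn_a", "a2")]
    for resource, task in tasks:
        owner[resource].submit(resource, task)  # queued before workers start
    for w in (w1, w2):
        w.start()
    for w in (w1, w2):
        w.shutdown()
    for w in (w1, w2):
        w.join()
    return w1.done, w2.done
```

Because the tasks in `run_demo` are queued before the workers start, `w1` drains `conn_a` before touching `conn_b`. If a resource can be held by several workers, the signaling gets more involved, which is the mapping question raised above.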
