
dask jobqueue dynamically allocated worker resources

I'm a new user of dask.jobqueue and I'm trying to schedule tasks with it.

The number of tasks to launch can vary (from a few dozen to several hundred), and the dependency graph between these tasks can be quite complex. Also, the resource demands of these tasks can vary greatly, because some are highly multithreaded while others are not. However, as far as I can tell, it is not currently possible to dynamically vary the resources of a "worker".

For instance, take this code snippet:

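import dask
from dask.distributed import Client
from dask_jobqueue import PBSCluster

@dask.delayed
def first_task():
    """Single-threaded task."""
    ...

@dask.delayed
def second_task(first_result):
    """Highly multithreaded task."""
    ...

@dask.delayed
def third_task(second_result):
    """Highly multithreaded task."""
    ...

cluster = PBSCluster(...)  # cluster arguments elided
cluster.scale(...)  # or cluster.adapt(...)
client = Client(cluster)

first_return = first_task()  # single-threaded
second_return = second_task(first_return)  # multithreaded
third_return = third_task(second_return)  # highly multithreaded

# launch tasks: compute the final delayed result, not the function itself
third_return.compute()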

The computation is triggered by .compute(), which accepts a named parameter 'resources'. From what I understand ( https://distributed.dask.org/en/latest/resources.html ), however, that parameter refers to resources that were declared for each worker when the cluster was created, which is not the dynamic per-worker resource management I am looking for.
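
To make that constraint concrete, here is a minimal pure-Python sketch of the model as I understand it from that page. It is not dask's implementation, and the names `ToyWorker` and `place` are hypothetical. Each worker advertises a fixed set of abstract resources once, at start-up, and a task can only be placed on a worker whose remaining resources cover the request.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    """Hypothetical stand-in for a worker; not a dask class."""
    name: str
    total: dict  # resources declared once, at worker start-up
    used: dict = field(default_factory=dict)

    def can_run(self, request: dict) -> bool:
        # A request fits if every requested quantity is still available.
        return all(
            self.total.get(key, 0) - self.used.get(key, 0) >= amount
            for key, amount in request.items()
        )

    def acquire(self, request: dict) -> None:
        # Reserve the requested quantities while the task runs.
        for key, amount in request.items():
            self.used[key] = self.used.get(key, 0) + amount

    def release(self, request: dict) -> None:
        # Give the quantities back once the task has finished.
        for key, amount in request.items():
            self.used[key] -= amount


def place(workers, request):
    """Reserve and return the first worker that fits, or None."""
    for worker in workers:
        if worker.can_run(request):
            worker.acquire(request)
            return worker
    return None


workers = [
    ToyWorker("small", {"cpu": 1}),
    ToyWorker("big", {"cpu": 24}),
]

mono = place(workers, {"cpu": 1})       # fits on the 1-cpu worker
threaded = place(workers, {"cpu": 24})  # only the 24-cpu worker qualifies
blocked = place(workers, {"cpu": 24})   # no capacity left, so it must wait
```

In this toy model the `total` dictionary never changes after a worker is created. That is exactly the limitation in question: a task can ask for a share of what a worker declared, but it cannot make the worker larger.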

Basically, is it possible to switch from:

third_task.compute(resources={...})

to something tied more closely to each of the tasks to be processed? Is there some way to attach the required resources to each dask.delayed?

@dask.delayed(resources={"cpu": 1, "ram": "5GB"})
def first_task(...):
...
@dask.delayed(resources={"cpu": 24, "ram": "120GB"})
def third_task(...):
...
third_task.compute()

Thanks,

As of 2020-03-27, the answer is no.
