
Dask workers not at 100%

When running Dask distributed on a cluster of 8 machines, each one having 8 cores (64 cores in total), I get this strange task stream:

[Task stream screenshot: bursts of tasks separated by white columns of idle time]

There are gaps between tasks (white "columns") which seem to appear randomly. Ideally, as I understand it, the workers should always be occupied with some pending task (as soon as a worker is free, a task is assigned to it). This is the main loop of my script, which generated the figure above:

from dask.distributed import as_completed

# with_results=True makes each item a (future, result) pair
task_pool = as_completed(futures, with_results=True)
batches = task_pool.batches()

while not self.stopping_condition_is_met():
    batch = next(batches)
    for _, received_solution in batch:
        ...
        # submit one replacement task per completed result
        new_task = self.client.submit(heavy_computation, args)
        task_pool.add(new_task)

        update_condition()
        if self.stopping_condition_is_met():
            break
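For reference, the same consume-one/submit-one pattern can be sketched with the standard library's `concurrent.futures` (standing in here for Dask, which is not required to run it; `heavy_computation` is replaced by a toy function and the stopping condition by a fixed budget). Note that each completion triggers at most one new submission, so the number of in-flight tasks can never grow:

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def heavy_computation(x):          # toy stand-in for the real task
    return x * x

BUDGET = 20                        # stand-in for the stopping condition

results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    # seed the pool, then replace each finished task with one new one
    pending = {pool.submit(heavy_computation, i) for i in range(4)}
    submitted = 4
    while pending:
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for fut in done:
            results.append(fut.result())
            if submitted < BUDGET:
                pending.add(pool.submit(heavy_computation, submitted))
                submitted += 1
# results holds the squares of 0..19, in completion order
```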

I have noticed that during those gaps there are 1-2 processing tasks and 100-120 in-memory tasks, which suddenly change to 30-40 processing and 80-100 in-memory. Why is this happening?

It's hard to say precisely, but my guess is that there just isn't enough work to keep all of your workers busy all the time. Ideally you would keep many more tasks live than you have worker threads. If you have only 1-2 processing tasks, then only 1-2 of your 64 threads can be active at once. Even with 30 active, you're still using under half of your cluster.

Maybe there is some way that you can split up your work a bit more, or keep more work available?
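One way to keep the backlog deep, assuming the work can be subdivided: pre-seed more tasks than worker threads, and submit several smaller tasks per completed one, so the scheduler always has runnable work queued. A hypothetical sketch with `concurrent.futures` (the names `small_piece`, `SPLIT_FACTOR`, and `BUDGET` are illustrative, not from the original code):

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def small_piece(x):                # illustrative: a smaller unit of work
    return x + 1

SPLIT_FACTOR = 3                   # new small tasks submitted per finished one
BUDGET = 30                        # total tasks (stand-in stopping condition)

results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    # pre-seed with more tasks than worker threads
    pending = {pool.submit(small_piece, i) for i in range(8)}
    submitted = 8
    while pending:
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for fut in done:
            results.append(fut.result())
            # refill the queue faster than it drains
            for _ in range(SPLIT_FACTOR):
                if submitted < BUDGET:
                    pending.add(pool.submit(small_piece, submitted))
                    submitted += 1
```

With Dask the same idea applies directly: call `task_pool.add()` several times per received result (or seed `futures` with a larger initial batch), so the scheduler never runs dry while a batch is being processed.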
