How do I run a dask.distributed cluster in a single thread?

How can I run a complete Dask.distributed cluster in a single thread? I want to use this for debugging or profiling.

Note: this is a frequently asked question. I'm adding the question and answer here on Stack Overflow for future reuse.

Local Scheduler

If you can get by with the single-machine scheduler's API (just compute), then you can use the single-threaded scheduler:

x.compute(scheduler='single-threaded')
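As a rough check that the single-threaded scheduler really runs tasks on the calling thread, the sketch below builds a small delayed task that reports its own thread id. The function name `worker_thread_id` is made up for illustration; `dask.delayed` and `dask.config.set` are the only Dask calls assumed here.

```python
import threading

import dask
from dask import delayed


@delayed
def worker_thread_id():
    # Report which thread actually executes the task.
    return threading.get_ident()


x = worker_thread_id()
main_thread = threading.get_ident()

# Pass the scheduler for a single call...
ran_on_caller = x.compute(scheduler="single-threaded") == main_thread

# ...or set it for a whole block, which is handy when stepping through with pdb.
with dask.config.set(scheduler="single-threaded"):
    ran_on_caller_via_config = x.compute() == main_thread

print(ran_on_caller, ran_on_caller_via_config)
```

Because everything runs in the calling thread, ordinary debuggers and profilers see the task code directly.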

Distributed Scheduler - Single Machine

If you want to run a dask.distributed cluster on a single machine, you can start the client with no arguments:

from dask.distributed import Client
client = Client()  # Starts local cluster
x.compute()

By default this starts several worker processes, each with its own threads, but everything runs on one machine.

Distributed Scheduler - Single Process

Alternatively, if you want to run everything in a single process, you can use the processes=False keyword:

from dask.distributed import Client
client = Client(processes=False)  # Starts local cluster
x.compute()

All of the communication and control happens in a single thread, though computation occurs in a separate thread pool.

Distributed Scheduler - Single Thread

To run control, communication, and computation all in a single thread, you need to give the worker an executor that runs tasks inline rather than in a thread pool, such as Tornado's DummyExecutor, which follows the concurrent.futures Executor interface. Beware, this Tornado API may not be public.

from dask.distributed import Scheduler, Worker, Client
from tornado.concurrent import DummyExecutor
from tornado.ioloop import IOLoop
import threading

loop = IOLoop()
e = DummyExecutor()  # runs each submitted task inline, in the calling thread
s = Scheduler(loop=loop)
s.start()
w = Worker(s.address, loop=loop, executor=e)
loop.add_callback(w._start)  # start the worker once the loop is running

async def f():
    async with Client(s.address, start=False) as c:
        # The task reports the id of the thread it ran on
        future = c.submit(threading.get_ident)
        result = await future
        return result

>>> threading.get_ident() == loop.run_sync(f)
True
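To make it clearer why swapping the executor keeps computation on the event loop's thread, here is a minimal standard-library sketch of an inline executor. `InlineExecutor` is a hypothetical stand-in written for this illustration, not Tornado's or Dask's actual class, and plain asyncio is used in place of a Dask worker.

```python
import asyncio
import threading
from concurrent.futures import Executor, Future


class InlineExecutor(Executor):
    """Hypothetical executor that runs each call immediately on the caller's thread."""

    def submit(self, fn, *args, **kwargs):
        future = Future()
        try:
            # No thread pool: the function runs right here, synchronously.
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Store the error on the future, as a pooled executor would.
            future.set_exception(exc)
        return future


async def run_task():
    loop = asyncio.get_running_loop()
    # The event loop hands the call to the executor, which runs it inline.
    return await loop.run_in_executor(InlineExecutor(), threading.get_ident)


task_thread = asyncio.run(run_task())
same_thread = task_thread == threading.get_ident()
print(same_thread)
```

The trade-off is that a long-running task blocks the event loop while it executes, which is acceptable for debugging or profiling but not for normal workloads.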
