
Dask progress during task

With a dask dataframe, using:
    df = dask.dataframe.from_pandas(df, npartitions=5)
    series = df.apply(func)
    future = client.compute(series)
    progress(future)

In a Jupyter notebook I can see a progress bar showing how many apply() calls have completed per partition (e.g. 2/5).
Is there a way for dask to report progress inside each partition?
Something like tqdm's progress_apply() for pandas.

If you mean how far along each individual call of func() is, then no, Dask has no way of knowing that. Dask calls Python functions which run in their own Python thread (Python threads cannot be interrupted by another thread), and Dask only knows whether a call has finished or not.

You could perhaps conceive of calling a function that has some internal callbacks or another reporting system, but I don't think I've seen anything like that.
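To make the "internal callbacks" idea concrete, here is a minimal sketch that uses only the standard library, not Dask. A thread pool stands in for the worker threads, and each task reports its own progress through a callback that writes to a thread-safe queue. The names apply_with_progress, run_partitions and report are hypothetical, chosen for illustration only, and the plain lists are an assumed stand-in for dataframe partitions.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def apply_with_progress(partition_id, rows, func, report):
    """Apply func to every row, calling report() after each row."""
    results = []
    for done, row in enumerate(rows, start=1):
        results.append(func(row))
        # The task itself announces how far it has got; the pool
        # (like the scheduler) only ever sees "finished" or "not finished".
        report(partition_id, done, len(rows))
    return results


def run_partitions(partitions, func):
    """Run one task per partition and collect the progress messages."""
    events = queue.Queue()  # thread-safe channel from tasks to the caller

    def report(partition_id, done, total):
        events.put((partition_id, done, total))

    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = [
            pool.submit(apply_with_progress, i, rows, func, report)
            for i, rows in enumerate(partitions)
        ]
        results = [f.result() for f in futures]

    # Drain the queue after the fact; a live display would instead
    # poll it while the futures are still running.
    progress_log = []
    while not events.empty():
        progress_log.append(events.get())
    return results, progress_log


# Hypothetical data: two "partitions" of plain Python values.
partitions = [[1, 2, 3], [4, 5]]
results, progress_log = run_partitions(partitions, lambda x: x * x)
```

Each entry in progress_log is a (partition, rows done, rows total) tuple, which is the kind of per-partition detail the built-in progress bar cannot show. On a real distributed cluster the in-memory queue would have to be replaced by something that can carry messages from the workers back to the client, so treat this as an illustration of the pattern rather than a drop-in solution.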
