
Does dask wait for resources to be available before running a function?

I'm working on some code that I plan to run on a server in the near future. Right now it works on my local machine, but multiple people will be running the program at the same time, and I'm worried that together they will use more RAM or VRAM than is available. If I use dask, will it wait for resources to become available before executing the function call?

Example code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from numba import njit
import numpy as np
from dask.distributed import Client, LocalCluster

@njit
def addingNumbers(big_array, big_array2, save_array):
    # Despite its name, this writes the element-wise product into save_array.
    for i in range(big_array.shape[0]):
        for j in range(big_array.shape[1]):
            save_array[i, j] = big_array[i, j] * big_array2[i, j]

    return save_array


if __name__ == "__main__":
    cluster = LocalCluster()
    client = Client(cluster)


    big_array = np.random.random_sample((100, 3000))
    big_array2  = np.random.random_sample((100, 3000))
    save_array = np.zeros(shape=(100, 3000))


    x = client.submit(addingNumbers, big_array, big_array2, save_array)
    y = client.gather(x)

If multiple people ran the above code at the same time and the server was almost out of RAM, would dask wait until RAM was available before submitting the function, or would it submit it anyway and leave the server to hit an out-of-memory error?

If dask doesn't wait until RAM is available, how would you queue the function call? Thanks.

If I use dask, will it wait for available resources before executing the function call?

Dask is unable to predict how much RAM your function will need. However, you can set a memory limit on each worker. As the data a worker holds approaches that limit, Dask pushes some of it to disk, and past a higher threshold the worker stops starting new tasks. See https://distributed.dask.org/en/latest/worker.html#memory-management
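As a rough sketch of what setting such a limit can look like, the snippet below starts a local cluster with a per-worker memory cap. The `2GiB` value, the single-worker layout, and the helper name `start_limited_cluster` are illustrative assumptions, not part of the original answer; the threshold fractions are the defaults described on the memory-management page linked above. Note that the limit applies to the workers of this one cluster only: separate `LocalCluster` instances started by other users do not coordinate with each other.

```python
import dask
from dask.distributed import Client, LocalCluster

# Fractions of memory_limit at which a worker reacts (documented defaults).
MEMORY_THRESHOLDS = {
    "distributed.worker.memory.target": 0.60,     # start spilling stored data to disk
    "distributed.worker.memory.spill": 0.70,      # spill based on process memory
    "distributed.worker.memory.pause": 0.80,      # stop starting new tasks on the worker
    "distributed.worker.memory.terminate": 0.95,  # the nanny restarts the worker
}


def start_limited_cluster(memory_limit="2GiB"):
    """Hypothetical helper: one worker, one thread, capped at memory_limit."""
    dask.config.set(MEMORY_THRESHOLDS)
    return LocalCluster(n_workers=1, threads_per_worker=1, memory_limit=memory_limit)


if __name__ == "__main__":
    with start_limited_cluster() as cluster, Client(cluster) as client:
        future = client.submit(sum, [1, 2, 3])
        print(future.result())
```

Because Dask cannot know in advance how much memory a task will allocate, a single task that exceeds the limit on its own can still cause the worker to be restarted; the cap mainly protects the rest of the server.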

How would you queue the function call?

The simplest solution would be to limit the number of active threads in a worker, or to use worker resources to limit the concurrency of only certain tasks per worker.
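A minimal sketch of the worker-resources approach, assuming a single local worker: the worker advertises one unit of an arbitrary resource, and each submitted task claims one unit, so the tasks run one at a time even though the worker has several threads. The label `MEM_SLOT`, the function `heavy_task`, and the helper `run_with_resource_limit` are hypothetical names chosen for illustration. For the simpler thread-limit variant, passing `threads_per_worker=1` (with `n_workers=1`) and dropping the `resources` arguments should have a similar effect for all tasks.

```python
from dask.distributed import Client, LocalCluster


def heavy_task(x):
    """Hypothetical stand-in for a memory-hungry function."""
    return x * 2


def run_with_resource_limit(inputs):
    # The worker offers one "MEM_SLOT"; every heavy_task claims one,
    # so at most one heavy_task runs at a time on this worker.
    with LocalCluster(n_workers=1, threads_per_worker=4,
                      resources={"MEM_SLOT": 1}) as cluster:
        with Client(cluster) as client:
            futures = [
                client.submit(heavy_task, x, resources={"MEM_SLOT": 1})
                for x in inputs
            ]
            return client.gather(futures)


if __name__ == "__main__":
    print(run_with_resource_limit([1, 2, 3]))
```

Tasks submitted without the `resources` argument are not restricted by the slot, which is why this option suits cases where only some functions are expensive.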
