
What can I do when I keep exceeding memory used while using Dask-ML

I am using Dask-ML to run some code which uses quite a bit of RAM during training. The training dataset itself is not large; it is the training process that uses a fair amount of RAM. I keep getting the following error message, even though I have tried different values for n_jobs :

distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting

What can I do?

P.S.: I have also tried using a Kaggle kernel (which allows up to 16 GB of RAM) and that didn't work, so I am trying Dask-ML now. I am also just connecting to the Dask cluster with its default parameter values, using the code below:

from dask.distributed import Client
import joblib

client = Client()

with joblib.parallel_backend('dask'):
    # my own training code runs here
    ...

Dask has a detailed page on techniques to help with memory management . You might also be interested in configuring spilling to disk on Dask workers . For example, rather than letting workers run until the nanny restarts them at 95% of their memory budget, you can lower the thresholds at which workers spill excess data to disk and pause taking new work.
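A minimal sketch of what that can look like, assuming a local cluster. The worker count, thread count, and memory_limit value below are illustrative assumptions, not values from the question; the distributed.worker.memory.* keys are Dask's standard worker-memory thresholds, expressed as fractions of each worker's memory limit:

import dask
from dask.distributed import Client, LocalCluster

# Lower the fractions at which a worker spills to disk and pauses new work,
# so spilling starts well before the 95% terminate threshold is hit.
dask.config.set({
    "distributed.worker.memory.target": 0.50,     # start spilling to disk
    "distributed.worker.memory.spill": 0.60,      # spill more aggressively
    "distributed.worker.memory.pause": 0.75,      # pause accepting new tasks
    "distributed.worker.memory.terminate": 0.95,  # nanny restarts the worker
})

# Illustrative values: give each worker an explicit memory budget instead of
# relying on Client() defaults, which split the machine's RAM across workers.
cluster = LocalCluster(n_workers=2, threads_per_worker=2, memory_limit="4GB")
client = Client(cluster)

Fewer workers with a larger per-worker memory_limit can also help when a single training task needs most of the RAM, since the budget is enforced per worker.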
