What can I do when I keep exceeding memory while using Dask-ML?
I am using Dask-ML to run some code that uses quite a bit of RAM during training. The training dataset itself is not large; it is the training itself that uses a fair bit of RAM. I keep getting the following error message, even though I have tried different values for n_jobs:
distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
What can I do?
PS: I have also tried using a Kaggle Kernel (which allows up to 16 GB of RAM) and that didn't work, so I am trying Dask-ML now. I am also just connecting to the Dask cluster with its default parameter values, using the code below:
from dask.distributed import Client
import joblib
client = Client()
with joblib.parallel_backend('dask'):
    # my own training code
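Since `Client()` with no arguments starts a local cluster with one worker per core and the machine's memory split evenly among them, one thing worth trying (a sketch, not from the original post; the worker counts and memory limit are illustrative values to tune for your machine) is to create the client with fewer workers and an explicit per-worker memory budget, so each worker has more headroom before the nanny's 95% threshold:

```python
from dask.distributed import Client
import joblib

# Illustrative values: fewer workers, each with a larger explicit budget.
# The nanny restarts a worker when it approaches memory_limit.
client = Client(
    n_workers=2,
    threads_per_worker=2,
    memory_limit="6GB",  # per worker, not total
)

with joblib.parallel_backend("dask"):
    pass  # your training code here
```

Fewer, larger workers trade parallelism for per-task headroom, which can help when individual training tasks are memory-hungry.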
Dask has a detailed page on techniques to help with memory management. You might also be interested in configuring spilling to disk on Dask workers. For example, rather
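The spill-to-disk behavior mentioned above can be adjusted through Dask's configuration system before the client is created. The fractions below are illustrative (they happen to match the documented defaults), expressed as fractions of each worker's memory limit:

```python
import dask

# Illustrative thresholds, as fractions of the per-worker memory limit.
# Set these BEFORE creating the Client so workers pick them up.
dask.config.set({
    "distributed.worker.memory.target": 0.60,     # start spilling data to disk
    "distributed.worker.memory.spill": 0.70,      # spill more aggressively
    "distributed.worker.memory.pause": 0.80,      # pause accepting new tasks
    "distributed.worker.memory.terminate": 0.95,  # nanny restarts the worker
})
```

Lowering `target` and `spill` makes workers move data to disk earlier, which slows things down but can avoid hitting the 95% termination threshold that produces the warning in the question.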