
Can I specify a max memory for the default dask scheduler on a shared machine?

I am working on a shared analysis node at Princeton University.

I often have my dask processes killed due to high memory consumption. This appears to be a precaution on the admins' side to keep the shared system stable.

To control resource usage I normally use a LocalCluster via dask.distributed, but in this particular case that prevents me from using a numerically efficient algorithm implemented with numba (see here for a discussion of the problem).
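For reference, this is roughly the pattern I normally rely on; memory_limit is the real LocalCluster parameter for capping each worker, and the worker counts and sizes here are only examples:

    # A per-worker memory cap with dask.distributed (sizes are illustrative).
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(n_workers=4, threads_per_worker=1, memory_limit="4GB")
    client = Client(cluster)  # workers spill/pause/restart as they near the limit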

I did find an answer here for specifying the number of threads to use, but is there a similar way to specify a maximum amount of memory for the threaded scheduler?
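For reference, the thread count can be set per call or globally (both are documented Dask options; the array shape and thread counts are only examples):

    import dask
    import dask.array as da
    from multiprocessing.pool import ThreadPool

    x = da.random.random((20000, 20000), chunks=(1000, 1000))

    # Per call: limit the threaded scheduler to 4 worker threads.
    total = x.sum().compute(scheduler="threads", num_workers=4)

    # Globally: hand the threaded scheduler a fixed-size pool.
    dask.config.set(pool=ThreadPool(4))
    total = x.sum().compute(scheduler="threads")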

No, Dask will not control the memory of the scheduler process. If it is growing large in memory, that is a sign that you're probably misusing it a bit; ideally the scheduler never stores any of your data.
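If a hard cap is needed anyway, one blunt fallback that is independent of Dask is an OS-level limit on the process's own address space (Unix only); a minimal sketch, with the limit value purely illustrative:

    import resource

    # Cap this process's virtual address space at 8 GiB (Unix only).
    # Allocations beyond the cap fail with MemoryError inside this process,
    # rather than growing until the shared machine's admins kill the job.
    limit_bytes = 8 * 1024**3  # example value
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))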
