
Cupy memory error on Google Colab with GPU - but only the second time I run the code

I am trying to do a matrix multiplication on two large arrays using CuPy, since it is significantly faster (about 100x) than using the CPU. My problem is that it works the first time I run it, but the second time and every time after that it gives me a memory error. This step sits inside a loop, so that is a problem; I can't restart the runtime each time.

Here is reproducible code with the same array sizes and data type:

import cupy as cp
from datetime import datetime  # 'import datetime' alone would make datetime.now() fail

# Try to release cached GPU memory from previous runs
cp.get_default_memory_pool().free_all_blocks()
cp.get_default_pinned_memory_pool().free_all_blocks()

x = cp.random.uniform(-1, 1, size=(3000, 300000))
w = cp.random.uniform(-1, 1, size=(300000, 1000))

start = datetime.now()
ans = cp.matmul(x, w)
stop = datetime.now()
print(stop - start)

Here is the error I get when I run it a second time in the same runtime:

---------------------------------------------------------------------------
OutOfMemoryError                          Traceback (most recent call last)
<ipython-input-5-43db33b58bc8> in <module>()
      2 cp.get_default_pinned_memory_pool().free_all_blocks()
      3 
----> 4 x = cp.random.uniform(-1,1,size = (3000,300000))
      5 w = cp.random.uniform(-1,1,size= (300000,1000))
      6 

4 frames
/usr/local/lib/python3.6/dist-packages/cupy/creation/basic.py in empty(shape, dtype, order)
     20 
     21     """
---> 22     return cupy.ndarray(shape, dtype, order=order)
     23 
     24 

cupy/core/core.pyx in cupy.core.core.ndarray.__init__()

cupy/cuda/memory.pyx in cupy.cuda.memory.alloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.MemoryPool.malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.MemoryPool.malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.SingleDeviceMemoryPool.malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.SingleDeviceMemoryPool._malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory._try_malloc()

OutOfMemoryError: Out of memory allocating 7,200,000,000 bytes (allocated so far: 9,624,000,000 bytes).
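The numbers in the error message line up exactly with the first run's arrays still being held: `free_all_blocks()` can only release pool blocks that are no longer referenced, and `x`, `w`, and `ans` are still alive from the previous run. A quick size check (float64 is 8 bytes per element) confirms this:

```python
# Byte counts implied by the traceback (float64 = 8 bytes per element)
x_bytes = 3000 * 300000 * 8    # the failed allocation in the error message
w_bytes = 300000 * 1000 * 8
ans_bytes = 3000 * 1000 * 8

print(x_bytes)                          # 7200000000 -> "allocating 7,200,000,000 bytes"
print(x_bytes + w_bytes + ans_bytes)    # 9624000000 -> "allocated so far: 9,624,000,000 bytes"
```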

Can this be fixed? I'm trying to clear the GPU memory in the first two lines, but I'm not sure that's correct. Maybe using a Dask array would work instead? But can that be done while still using the GPU for speed?

Yes, using CuPy-backed Dask arrays would probably work here. You would want to make sure to use the single-threaded scheduler (.compute(scheduler="single-threaded")). Assuming you build your arrays in such a way that Dask can load a few chunks at a time, Dask will probably be able to load some chunks, do part of the computation, throw out the intermediates, and then load other chunks.
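The chunked evaluation described above can be sketched with plain NumPy; on a GPU, `cp` (CuPy) drops in for `np` with the same array API. The `blocked_matmul` helper and its `chunk` parameter are illustrative, not Dask API; Dask does the analogous blocking (plus lazy chunk creation) automatically:

```python
import numpy as np


def blocked_matmul(x, w, chunk=1000):
    """Compute x @ w one row-block of x at a time.

    Only one (chunk, k) slice of x is processed per step, which is the
    same idea Dask uses to bound peak memory during the computation.
    """
    out = np.empty((x.shape[0], w.shape[1]), dtype=x.dtype)
    for i in range(0, x.shape[0], chunk):
        out[i:i + chunk] = x[i:i + chunk] @ w
    return out


# Small shapes so the check runs anywhere; the result matches np.matmul.
x = np.random.uniform(-1, 1, size=(300, 400))
w = np.random.uniform(-1, 1, size=(400, 50))
assert np.allclose(blocked_matmul(x, w, chunk=64), x @ w)
```

With real Dask, the equivalent would be chunked `dask.array` arrays backed by CuPy, finished with .compute(scheduler="single-threaded") as noted above.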
