Loading a .npy file that is too large for RAM
I am trying to load a large .npy file (~800 MB) into Google Colab, but every time I try to do so, the Colab instance crashes due to excessive RAM usage.
import numpy as np

a = np.load('oddata.npy', allow_pickle=True)
I am using the basic Colab instance with 12 GB RAM.
I have tried using mmap, but it returns this error: ValueError: Array can't be memory-mapped: Python objects in dtype.
Is there any way around this problem, like breaking the .npy file into chunks or converting it into another file format?
Best,
Araf
I think there is more to your problem than insufficient memory. The 12 GB allocated in your Colab instance should be more than enough to read an 800 MB file. To confirm, I ran a simple test on my Raspberry Pi (which only has 4 GB RAM). It can create a 1 GB .npy file and read it back into a new array. Here is the code:
import numpy as np

# Create a large test array: 1000 "images" of 512x512
nimg, n0, n1 = 1000, 512, 512
arr = np.arange(nimg * n0 * n1).reshape(nimg, n0, n1)
print(arr.dtype, arr.shape)

# Save it to disk, then read it back into a new array
np.save('SO_67671598.npy', arr)
arr2 = np.load('SO_67671598.npy')
print(arr2.dtype, arr2.shape)
I get the same result with or without the allow_pickle=True parameter. Note that allow_pickle=True is not recommended (for security reasons); it is only necessary when you are loading object arrays. I suggest you run this test in your Colab instance and see what you get.
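Also, the ValueError you saw means your array has an object dtype, which NumPy cannot memory-map. If your data can be converted to a plain numeric dtype, one possible workaround (a sketch, not a guaranteed fix for your specific data; the filename `converted.npy` is illustrative) is to re-save it and then reload with `mmap_mode='r'`, which maps the file on disk instead of reading it all into RAM:

```python
import numpy as np

# Illustrative stand-in for the real data: a plain numeric dtype
# (no Python objects) is required for memory mapping.
arr = np.arange(10 * 64 * 64, dtype=np.float32).reshape(10, 64, 64)
np.save('converted.npy', arr)

# mmap_mode='r' maps the file rather than loading it into RAM;
# only the slices you actually index are read from disk.
arr2 = np.load('converted.npy', mmap_mode='r')
print(arr2.dtype, arr2.shape)

# Pull just one 64x64 slice into memory at a time
chunk = np.array(arr2[0])
```

With a memory-mapped array you can iterate over it slice by slice, so peak RAM usage stays at roughly the size of one chunk rather than the whole file.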