When I try to save a very large (20000 x 20000 element) array, I get all zeros back:
In [2]: shape = (20000,)*2
In [3]: r = np.random.randint(0, 10, shape)
In [4]: r.tofile('r.data')
In [5]: ls -lh r.data
-rw-r--r-- 1 whg staff 3.0G 23 Jul 16:18 r.data
In [6]: r[:6,:6]
Out[6]:
array([[6, 9, 8, 7, 4, 4],
       [5, 9, 5, 0, 9, 4],
       [6, 0, 9, 5, 7, 6],
       [4, 0, 8, 8, 4, 7],
       [8, 3, 3, 8, 7, 9],
       [5, 6, 1, 3, 1, 4]])
In [7]: r = np.fromfile('r.data', dtype=np.int64)
In [8]: r = r.reshape(shape)
In [9]: r[:6,:6]
Out[9]:
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]])
np.save() does similar strange things.
After searching the net, I found that this is a known bug on OS X:
https://github.com/numpy/numpy/issues/2806
When I try to read the tostring() data from a file using Python's read(), I get a memory error.
Is there a better way of doing this? Can anyone recommend a pragmatic workaround to this problem?
Use mmap to memory-map the file, and np.frombuffer to create an array that points into the buffer. Tested on x86_64 Linux:
# `r.data` and `shape` created as in the question
>>> import mmap
>>> import numpy as np
>>> with open('r.data', 'rb') as f:
...     m = mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
...
>>> r = np.frombuffer(m, dtype='int64')
>>> r = r.reshape(shape)
>>> r[:6, :6]
array([[7, 5, 9, 5, 3, 5],
       [2, 7, 2, 6, 7, 0],
       [9, 4, 8, 2, 5, 0],
       [7, 2, 4, 6, 6, 7],
       [2, 9, 2, 2, 2, 6],
       [5, 2, 2, 6, 1, 5]])
Note that here r is a view of memory-mapped data, which makes it more memory-efficient, but comes with the side effect of automatically picking up changes to the file contents. If you want it to point to a private copy of the data, as the array returned by np.fromfile does, add r = np.copy(r).
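To make the difference between the mapped view and a private copy concrete, here is a minimal sketch on a tiny stand-in file. The file name small.data and the 3 x 4 shape are assumptions for illustration, not part of the question. It maps the file read-only, takes a copy, then overwrites the first element on disk through an ordinary file handle and prints what each array holds.

```python
import mmap
import os
import tempfile

import numpy as np

# Small stand-in for the 3 GB file from the question (assumed name and size).
path = os.path.join(tempfile.mkdtemp(), 'small.data')
np.arange(12, dtype=np.int64).tofile(path)

with open(path, 'rb') as f:
    # Read-only mapping of the whole file (length 0 means "entire file").
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

view = np.frombuffer(m, dtype=np.int64).reshape(3, 4)  # backed by the mapping
private = np.copy(view)                                 # independent in-memory copy

# Overwrite the first element on disk through an ordinary file handle.
with open(path, 'r+b') as f:
    f.write(np.int64(-1).tobytes())

# The view is read-only; the copy is an ordinary writable array.
print(view.flags.writeable, private.flags.writeable)
# The copy still holds the old value; the view reads whatever the mapping now shows.
print(view[0, 0], private[0, 0])
```

The copy costs as much memory as the file itself, so for the 3 GB array in the question it is only worth taking when isolation from later file changes matters.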
(Also, as written, this won't run under Windows, which requires slightly different mmap flags.)
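One way to avoid the platform-specific flags is the access keyword of mmap.mmap, which the standard library accepts on both Unix and Windows. The sketch below wraps that in a helper; the name load_raw and the demo file are hypothetical, and it has only been reasoned through against the documented mmap signature, not tested on the OS X setup from the question.

```python
import mmap
import os
import tempfile

import numpy as np


def load_raw(path, dtype, shape):
    """Map a raw binary file read-only and view it as an array (hypothetical helper)."""
    with open(path, 'rb') as f:
        # `access=` works on Unix and Windows alike, unlike the Unix-only
        # flags/prot arguments (MAP_SHARED, PROT_READ) used above.
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # The returned array holds a reference to the mapping, keeping it alive.
    return np.frombuffer(m, dtype=dtype).reshape(shape)


# Usage on a small demo file (assumed name and shape).
path = os.path.join(tempfile.mkdtemp(), 'demo.data')
expected = np.arange(20, dtype=np.int64).reshape(4, 5)
expected.tofile(path)

loaded = load_raw(path, np.int64, (4, 5))
print(np.array_equal(loaded, expected))
```

For the file in the question, the call would be load_raw('r.data', np.int64, shape).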