I have a huge numpy.ndarray of images, array1, that takes 60GB when loaded into RAM. I need to remove the last n elements of that array. An easy solution would be:
array1 = array1[:n-1]
But when I do this, no RAM is freed. Why is that? How can I actually free memory by removing these elements? Freeing memory is the reason I am removing them in the first place.
array1[:n-1] is a view, a new array which shares the data buffer with the original array1. Even though you rebind the name array1 to that view, the view still references the original data buffer, so the buffer is neither resized nor freed.
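The sharing can be checked directly. This is a minimal sketch on a small stand-in array rather than the 60GB one; the names big and kept are illustrative and do not come from the question.

```python
import numpy as np

big = np.arange(1000)        # stand-in for the large image array
kept = big[:500]             # basic slicing returns a view, not a copy

print(kept.base is big)              # the view keeps a reference to its source
print(np.shares_memory(kept, big))   # both arrays use the same data buffer
print(kept.flags.owndata)            # the view does not own the buffer

kept[0] = -1                 # writing through the view...
print(big[0])                # ...changes the original array too
```

As long as kept is alive, the full buffer of big has to stay allocated, whatever name it is bound to.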
array1.resize(n-1) - the docs indicate that the data buffer is resized/reallocated in place, provided NumPy can tell that this buffer is not shared with anything else.
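The "not shared with anything else" condition can be seen in a small experiment. This is a sketch assuming CPython and the default refcheck=True; the flag variables are illustrative names only.

```python
import numpy as np

arr = np.arange(1000)
view = arr[:500]

# A view does not own its data, so it cannot be resized in place.
try:
    view.resize(250)
    view_resized = True
except ValueError:
    view_resized = False

# While the view is alive, the owner is referenced by another array.
try:
    arr.resize(500)
    owner_resized_with_view = True
except ValueError:
    owner_resized_with_view = False

del view            # drop the only other reference to the buffer
arr.resize(500)     # now the in-place resize is allowed

print(view_resized, owner_resized_with_view, arr.shape)
```

In an interactive session the output history (for example `_` in IPython) can hold extra references to an array, which may make resize refuse even though no view was created explicitly.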
In [1105]: arr=np.arange(1000)
In [1106]: arr.nbytes
Out[1106]: 4000
In [1107]: sys.getsizeof(arr) # those bytes plus overhead
Out[1107]: 4048
In [1108]: arr = arr[:500] # your slice
In [1109]: arr.nbytes # fewer bytes
Out[1109]: 2000
In [1110]: sys.getsizeof(arr) # just the overhead
Out[1110]: 48
sys.getsizeof gets the size of the view, but since it shares the buffer with the original arr, we only see the 'overhead'. The original arr still exists, but it isn't accessible by name.
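The original array can still be reached through the view's base attribute, which is one way to confirm that the full buffer is still allocated. The sketch below also shows one possible way to detach from it by copying the slice; this is an alternative to resize, not something the text above prescribes, and the names hidden and full_nbytes are illustrative.

```python
import numpy as np

arr = np.arange(1000)
full_nbytes = arr.nbytes

arr = arr[:500]                      # rebind the name to a view
hidden = arr.base                    # the original array is still reachable here
print(hidden.shape)                  # (1000,)
print(hidden.nbytes == full_nbytes)  # the whole buffer is still allocated

del hidden
arr = arr.copy()                     # allocate a new, smaller buffer
print(arr.base is None)              # no reference to the old buffer remains
print(arr.flags.owndata)
```

Note that the copy needs the old and the new buffer at the same time for a moment, which matters for an array that already fills most of the RAM.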
In [1111]: arr=np.arange(1000)
In [1112]: arr.resize(500)
In [1113]: arr.nbytes
Out[1113]: 2000
In [1114]: sys.getsizeof(arr)
Out[1114]: 2048
With the resize method it appears that the data buffer has been resized, freeing up half of it. But I'm not sure there's a good way of testing that, at least not for small arrays like this.
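One possible check on a larger array is Python's tracemalloc module. This is a rough sketch that assumes NumPy reports its data buffer allocations to tracemalloc (which it has done since version 1.13, as far as I know); it only shows what the allocator bookkeeping says, not what the operating system does with the memory.

```python
import tracemalloc
import numpy as np

tracemalloc.start()

arr = np.arange(1_000_000)           # large enough to dominate other allocations
before, _ = tracemalloc.get_traced_memory()

arr.resize(500_000)                  # shrink the buffer in place
after, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()

print(arr.nbytes)                    # size of the buffer after the resize
print(before - after)                # bytes that tracemalloc no longer counts
```

If the assumption holds, the difference should be close to the number of bytes removed by the resize.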
Potentially we have 3 systems managing memory: numpy, the Python interpreter, and the operating system. We'd have to dig much further into the code (possibly the C-API) to find out whether, after resize, the memory is added to some sort of numpy cache, gets collected by the Python garbage collector, or gets returned to the system.
============
resize followed by setting a new shape seems to reduce the size along the first axis:
In [1120]: arr = np.arange(100).reshape(10,10).copy()
In [1121]: arr.resize(50)
In [1122]: sys.getsizeof(arr)
Out[1122]: 248
In [1123]: arr = np.arange(100).reshape(10,10).copy()
In [1124]: sys.getsizeof(arr)
Out[1124]: 456
In [1125]: arr.resize(50)
In [1126]: sys.getsizeof(arr)
Out[1126]: 248
In [1127]: arr.shape
Out[1127]: (50,)
In [1128]: arr.shape=(5,10) # inplace reshape
In [1129]: arr
Out[1129]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])
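The two steps can probably be combined, since resize also accepts a shape tuple. This is a sketch that assumes a C-contiguous (row-major) array that owns its data, so that the first 50 elements in memory are the first 5 rows.

```python
import numpy as np

arr = np.arange(100).reshape(10, 10).copy()  # copy so that arr owns its data
arr.resize((5, 10))                          # shrink and reshape in one call

print(arr.shape)
print(arr[-1])                               # last remaining row
```

For the image array in the question this would mean passing the full new shape, with only the first dimension reduced.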