
Memory after removing last N elements of numpy.ndarray

I have a huge numpy.ndarray of images, array1, that takes 60 GB of RAM when loaded. I need to remove the last n elements of that array. An easy solution would be:

array1 = array1[:n-1]

But when I do that, no RAM is freed. Why is that? How can I actually free the memory held by the removed elements? Freeing that memory is the whole reason I am removing them.

array1[:n-1] is a view: a new array object that shares its data buffer with the original array1. Even though you reassign array1, the view keeps a reference to the original array, so the data buffer is neither resized nor freed.
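A minimal sketch of that sharing, using a small array in place of the 60 GB one (the names original and view are only for illustration):

```python
import numpy as np

original = np.arange(1000)
view = original[:500]          # same kind of slice as in the question

# The slice does not own a buffer of its own; it points into original's.
print(np.shares_memory(original, view))
print(view.flags.owndata)

# Writing through the view changes the original, which confirms the sharing.
view[0] = -1
print(original[0])
```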

An alternative is array1.resize(n-1). The docs indicate that the data buffer is resized/reallocated, provided it is clear that this buffer is not shared with anything else.
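The ownership condition can be sketched as follows. This is an illustration on a small array, not a recipe for the 60 GB case. refcheck=False is used here only because interactive sessions often hold hidden references (for example IPython's output cache); it skips NumPy's reference check, so it is only safe when no other view of the array is still in use.

```python
import numpy as np

owner = np.arange(1000)
view = owner[:500]

# A view does not own its data, so it cannot be resized in place.
try:
    view.resize(250)
except ValueError as exc:
    view_failed = True
    print("view:", exc)
else:
    view_failed = False

# Once nothing else uses the buffer, the owning array can shrink in place.
del view
owner.resize(500, refcheck=False)
print(owner.shape, owner.flags.owndata)
```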

In [1105]: arr=np.arange(1000)
In [1106]: arr.nbytes
Out[1106]: 4000
In [1107]: sys.getsizeof(arr)   # those bytes plus overhead
Out[1107]: 4048
In [1108]: arr = arr[:500]      # your slice
In [1109]: arr.nbytes           # fewer bytes
Out[1109]: 2000
In [1110]: sys.getsizeof(arr)   # just the overhead
Out[1110]: 48

sys.getsizeof reports the size of the view, but since the view shares its buffer with the original arr, we only see the 'overhead'. The original arr still exists, but it is no longer accessible by name (only through the view's base attribute).
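To see how much memory is really being kept alive, one can follow the base attribute back to the owning array. The helper backing_nbytes below is hypothetical and assumes the base is itself an ndarray (it will not work for arrays backed by other objects such as memory maps or bytes):

```python
import sys
import numpy as np

def backing_nbytes(a):
    """Bytes in the buffer that keeps `a` alive (hypothetical helper)."""
    owner = a if a.base is None else a.base
    return owner.nbytes  # assumption: the base is an ndarray

arr = np.arange(1000)
full = arr.nbytes
arr = arr[:500]                # the slice from the question

print(arr.nbytes)              # bytes addressed by the view
print(backing_nbytes(arr))     # bytes still allocated for the original
print(sys.getsizeof(arr))      # array header only, no data
```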

In [1111]: arr=np.arange(1000)
In [1112]: arr.resize(500)
In [1113]: arr.nbytes
Out[1113]: 2000
In [1114]: sys.getsizeof(arr)
Out[1114]: 2048

With the resize method it appears that the data buffer has been resized, freeing up half of it. But I'm not sure there's a good way of testing that, at least not for small arrays like this.

Potentially we have 3 systems managing memory: numpy, the Python interpreter, and the operating system. We'd have to dig much further into the code (possibly the C-API) to find out whether, after resize, the memory is added to some sort of numpy cache, gets collected by the Python garbage collector, or gets returned to the system.
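One partial check is the standard-library tracemalloc module, which NumPy reports its data allocations to. The sketch below only shows what NumPy has requested from the allocator before and after resize; it says nothing about whether the operating system gets the pages back, which would need a process-level tool. The array size is arbitrary.

```python
import tracemalloc
import numpy as np

tracemalloc.start()

arr = np.ones(1_000_000)                  # about 8 MB of float64
before, _ = tracemalloc.get_traced_memory()

arr.resize(500_000, refcheck=False)       # safe here: no other views exist
after, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()

print(f"traced before: {before / 1e6:.1f} MB")
print(f"traced after:  {after / 1e6:.1f} MB")
```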

============

resize followed by a new shape seems to reduce the size along the first axis:

In [1120]: arr = np.arange(100).reshape(10,10).copy()
In [1121]: arr.resize(50)
In [1122]: sys.getsizeof(arr)
Out[1122]: 248
In [1123]: arr = np.arange(100).reshape(10,10).copy()
In [1124]: sys.getsizeof(arr)
Out[1124]: 456
In [1125]: arr.resize(50)
In [1126]: sys.getsizeof(arr)
Out[1126]: 248
In [1127]: arr.shape
Out[1127]: (50,)
In [1128]: arr.shape=(5,10)   # inplace reshape
In [1129]: arr
Out[1129]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])
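The two steps can also be combined by passing the full target shape to resize. The helper drop_last below is hypothetical; it assumes the array owns its data and is C-contiguous, so that truncating the buffer keeps exactly the leading entries along the first axis.

```python
import numpy as np

def drop_last(arr, n):
    """Shrink `arr` in place, dropping its last n entries along axis 0.

    Hypothetical helper. refcheck=False skips NumPy's reference check,
    so the caller must make sure no other view of `arr` is in use.
    """
    if not (arr.flags.owndata and arr.flags.c_contiguous):
        raise ValueError("array must own its data and be C-contiguous")
    keep = arr.shape[0] - n
    arr.resize((keep,) + arr.shape[1:], refcheck=False)

images = np.arange(10 * 4 * 4).reshape(10, 4, 4).copy()
expected = images[:7].copy()   # reference copy, only for comparison

drop_last(images, 3)
print(images.shape)
print(np.array_equal(images, expected))
```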
