
Why doesn't the memory size change when you delete an item from the list?

Yeah, why doesn't the size change when you delete an item from the list?
Is there a way to change this behavior?

Python 2.7.5+ (default, Feb 27 2014, 19:37:08) 
>>> from sys import getsizeof
>>> x = [1, 2, 3, 4]
>>> print getsizeof(x), x
104 [1, 2, 3, 4]
>>> del x[3]
>>> print getsizeof(x), x
104 [1, 2, 3]

It usually doesn't change when you add an item either. This is because the implementation of list overallocates as an optimization. There is no way to change this other than to modify Python's source code.
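To see the over-allocation from the Python side, here is a minimal sketch that counts how many appends actually change the reported size. The helper name count_size_changes is made up for this illustration, and the exact count depends on the interpreter version and build; print is called as a function so the snippet also runs on Python 3.

```python
from sys import getsizeof

def count_size_changes(n_appends):
    """Append n_appends items to an empty list and count how many
    of those appends changed sys.getsizeof() of the list."""
    x = []
    changes = 0
    last = getsizeof(x)
    for i in range(n_appends):
        x.append(i)
        size = getsizeof(x)
        if size != last:
            # The internal array was reallocated on this append.
            changes += 1
            last = size
    return changes

changes = count_size_changes(1000)
print("%d of 1000 appends changed the size" % changes)
```

On CPython only a small fraction of the appends should report a new size; the rest reuse slots that were already allocated.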

getsizeof reports the memory consumption of the list object, not its length. Deleting items doesn't cause the list to free memory until the list shrinks to less than half of its allocated capacity; until then it keeps that memory to hold future items, reducing allocation costs.

Also, getsizeof doesn't include the memory taken by the elements of the list, only the memory for the list header and the dynamic array of pointers.
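A rough sketch of that difference, assuming a flat list (it looks only one level deep, so nested containers are not followed). The helper shallow_and_total_size is a hypothetical name used only for this illustration.

```python
from sys import getsizeof

def shallow_and_total_size(lst):
    """Return (size of the list object alone, that size plus the
    sizes of the distinct objects the list references)."""
    shallow = getsizeof(lst)
    # Deduplicate by identity so an object referenced twice is counted once.
    distinct = {id(item): item for item in lst}
    total = shallow + sum(getsizeof(item) for item in distinct.values())
    return shallow, total

small = [None] * 1000
big = ["x" * 10000] * 1000  # 1000 references to one 10000-character string

# Both lists hold 1000 pointers, so getsizeof reports the same value for each.
print(getsizeof(small) == getsizeof(big))
print(shallow_and_total_size(big))
```

The two lists report the same getsizeof value because each stores 1000 pointers; the size of the string only shows up once the elements are counted separately.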

If you want to trim the memory consumption, create a slice copy of the list:

>>> import sys
>>> x = [1]*3
>>> del x[2]
>>> sys.getsizeof(x)
88
>>> sys.getsizeof(x[:])
80

This is generally not necessary, though, and doing it for every deletion is almost certainly a bad idea.

The list object doesn't resize the internal C array every time you delete; that'd be very inefficient.

Instead, the list object over-allocates memory when elements are added, and when elements are deleted it only shrinks the array once the remaining items would fit in less than half of the allocated space. From the C code:

if (allocated >= newsize && newsize >= (allocated >> 1)) {
    assert(self->ob_item != NULL || newsize == 0);
    Py_SIZE(self) = newsize;
    return 0;
}

where newsize is the number of object references the list will hold after the operation, and allocated is the number of slots in the over-allocated array. The test above skips the reallocation as long as newsize is still greater than or equal to half the allocated space.

Even when the list shrinks, the array is over-allocated again to receive new elements: a reallocation leaves at least 3 spare slots in the array, unless the list has become empty.

As such, sys.getsizeof() remains stable until you have deleted enough elements:

>>> from sys import getsizeof
>>> x = [1, 2, 3, 4] * 3
>>> x
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
>>> print getsizeof(x), x
168 [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
>>> del x[-1]
>>> del x[-1]
>>> del x[-1]
>>> del x[-1]
>>> print getsizeof(x), x
168 [1, 2, 3, 4, 1, 2, 3, 4]
>>> del x[-1]
>>> del x[-1]
>>> del x[-1]
>>> del x[-1]
>>> print getsizeof(x), x
136 [1, 2, 3, 4]

In the other direction, when adding elements, the over-allocation grows in steps proportional to the number of references stored:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

See the list_resize() function in the listobject.c source code (Python 2.7 version). In the full function, newsize is then added to new_allocated to give the new capacity.
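The two excerpts from list_resize() above can be modelled in a few lines of Python. This is only a sketch of the Python 2.7 logic, not the interpreter's code: it assumes that newsize is added to the over-allocation and that an empty list gets a capacity of 0, as the full function does, and all three function names are made up for this illustration. Newer CPython versions use a different formula, so the numbers apply to 2.7 only.

```python
def model_list_resize(allocated, newsize):
    """Return the capacity a list would have after being resized to
    newsize items, following the Python 2.7 list_resize() logic."""
    # Enough room already, and at least half of it still in use: keep the array.
    if allocated >= newsize and newsize >= (allocated >> 1):
        return allocated
    # An empty list releases its array entirely.
    if newsize == 0:
        return 0
    # Otherwise reallocate with mild, proportional over-allocation.
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)

def capacities_while_appending(n_appends):
    """Distinct capacities seen while appending n_appends items to an empty list."""
    allocated = 0
    seen = [allocated]
    for newsize in range(1, n_appends + 1):
        allocated = model_list_resize(allocated, newsize)
        if allocated != seen[-1]:
            seen.append(allocated)
    return seen

def capacity_after_deletes(allocated, length, n_deletes):
    """Capacity after deleting n_deletes items one at a time."""
    for newsize in range(length - 1, length - n_deletes - 1, -1):
        allocated = model_list_resize(allocated, newsize)
    return allocated

print(capacities_while_appending(73))
# [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
print(capacity_after_deletes(12, 12, 4))  # 12
print(capacity_after_deletes(12, 12, 8))  # 8
```

The first line reproduces the growth pattern from the source comment. The last two lines mirror the session above, assuming the 12-item list started with exactly 12 slots: after four deletions the capacity is unchanged, and the drop from 12 to 8 slots matches the reported 168 to 136 bytes if each slot is an 8-byte pointer.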

Of course the actual len() output does reflect the number of items the list references.

The resizing of a list follows a certain pattern. From the listobject.c source code:

...
/* Bypass realloc() when a previous overallocation is large enough
   to accommodate the newsize.  If the newsize falls lower than half
   the allocated size, then proceed with the realloc() to shrink the list.
*/
if (allocated >= newsize && newsize >= (allocated >> 1)) {
    assert(self->ob_item != NULL || newsize == 0);
    Py_SIZE(self) = newsize;
    return 0;
}

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
...

Resizing immediately just because an item was appended or removed would result in poor performance.

My best guess would be that Python isn't freeing up the end of the list each time an item is removed. It holds on to that space for a while in case it is needed again. It probably waits until there is a sufficient number of empty slots to warrant clearing them in one fell swoop.
