I'm a little puzzled about how Python allocates memory and garbage-collects, and how platform-specific that is. For example, when we compare the following two code snippets:
Snippet A:
>>> id('x' * 10000000) == id('x' * 10000000)
True
Snippet B:
>>> x = "x"*10000000
>>> y = "x"*10000000
>>> id(x) == id(y)
False
Snippet A returns True because Python allocates both temporary strings at the same memory location, while in Snippet B the two strings are alive at the same time and so live at different locations, which is why their ids differ.
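A minimal sketch of why the two snippets differ, assuming CPython's reference counting. The `Tracked` class and the `events` list are hypothetical helpers invented for this illustration; they only log when an object is created and destroyed. The log suggests that in the Snippet A pattern the first temporary is already gone before the second one is built, so its address is free to be reused, whereas in the Snippet B pattern both objects are alive at once and must have different ids.

```python
events = []  # ordered log of object lifetimes


class Tracked(object):
    """Hypothetical helper that logs its own creation and destruction."""

    def __init__(self, name):
        self.name = name
        events.append("create " + name)

    def __del__(self):
        events.append("destroy " + self.name)


# Snippet A pattern: each temporary loses its last reference when id() returns.
same_id = id(Tracked("a")) == id(Tracked("b"))
temp_events = list(events)

# Snippet B pattern: both objects are bound to names and stay alive.
del events[:]
x = Tracked("x")
y = Tracked("y")
distinct = id(x) != id(y)

print(temp_events)  # on CPython, "destroy a" comes before "create b"
print(same_id)      # often True on CPython, but not guaranteed
print(distinct)     # True: live objects never share an id
```

Whether `same_id` comes out True depends on the allocator handing back the same block, which is exactly the part that can vary between platforms.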
But apparently system performance or platform impacts this, because when I try this on a larger scale:
for i in xrange(1, 1000000000):
    if id('x' * i) != id('x' * i):
        print i
        break
A friend on a Mac tried this, and it ran until the end. When I ran it on a bunch of Linux VMs, it would invariably return (but at different times) on different VMs. Is this because of the scheduling of the garbage collection in Python? Was it because my Linux VMs had less processing speed than the Mac, or does the Linux Python implementation garbage-collect differently?
The garbage collector just uses whatever space is convenient. There are lots of different garbage collection strategies, and things are also affected by parameters, different platforms, memory usage, phase of the moon, etc. Trying to guess how the interpreter will happen to allocate particular objects is just a waste of time.
It happens because Python caches small integers and small strings.
Large strings stored in variables are not cached:
In [32]: x = "x"*10000000
In [33]: y = "x"*10000000
In [34]: x is y
Out[34]: False
Large strings not stored in variables look cached (the first temporary is freed before the second is created, so its address can be reused):
In [35]: id('x' * 10000000) == id('x' * 10000000)
Out[35]: True
Small strings are cached:
In [36]: x="abcd"
In [37]: y="abcd"
In [38]: x is y
Out[38]: True
Small integers are cached:
In [39]: x=3
In [40]: y=3
In [41]: x is y
Out[41]: True
Large integers stored in variables are not cached:
In [49]: x=12345678
In [50]: y=12345678
In [51]: x is y
Out[51]: False
Large integers not stored in variables look cached:
In [52]: id(12345678)==id(12345678)
Out[52]: True
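A small sketch of the practical consequence, assuming CPython (the small-integer cache is an implementation detail, not a language guarantee). The integers are built with `int()` at run time so that the compiler cannot merge equal literals into a single constant. Identity (`is`) reflects caching, while equality (`==`) is what code should normally rely on.

```python
# Build the values at run time so no compile-time constant sharing is involved.
small_a = int("3")
small_b = int("3")
big_a = int("12345678")
big_b = int("12345678")

print(small_a is small_b)  # True on CPython: small integers are cached
print(big_a is big_b)      # False on CPython: two separate live objects
print(big_a == big_b)      # True everywhere: the values are equal
```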
CPython uses two strategies for memory management: reference counting, and a garbage collector that handles reference cycles.
Allocation is generally done via the platform's malloc/free functions and inherits the performance characteristics of the underlying runtime. Whether freed memory is reused is decided by the operating system and its C runtime. (Some objects are pooled by the Python VM.)
Your example, however, does not trigger the 'real' GC algorithm (that is only used to collect cycles). Your long string is deallocated as soon as the last reference to it is dropped.
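The difference between the two strategies can be observed with a short sketch, assuming CPython. The `Node` class is a hypothetical placeholder; `weakref.ref` is used only to check whether an object still exists without keeping it alive. With the cycle collector switched off, an ordinary object still disappears the moment its last reference is dropped, while two objects that refer to each other survive until `gc.collect()` runs.

```python
import gc
import weakref


class Node(object):
    """Hypothetical placeholder object that can hold references."""
    pass


gc.disable()  # switch off the cycle collector; reference counting still works

# No cycle: freed immediately when the last reference goes away.
n = Node()
probe = weakref.ref(n)
del n
freed_by_refcount = probe() is None

# Reference cycle: the objects keep each other alive.
a = Node()
b = Node()
a.other = b
b.other = a
cycle_probe = weakref.ref(a)
del a, b
alive_before_collect = cycle_probe() is not None

gc.collect()  # run the cycle collector by hand
freed_after_collect = cycle_probe() is None

gc.enable()

print(freed_by_refcount)     # True on CPython
print(alive_before_collect)  # True: reference counting alone cannot free a cycle
print(freed_after_collect)   # True: the collector found the cycle
```

The long string in the question behaves like the first case: no cycle is involved, so the collector's scheduling plays no part in when its memory is released.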