
Python: how variables are stored in memory

>>> a = [1234, 1234]  # a list
>>> a
[1234, 1234]
>>> id(a[0])
38032480
>>> id(a[1])
38032480
>>> b = 1234  # b is a variable of integer type
>>> id(b)
38032384

Why is id(b) not the same as id(a[0]) and id(a[1]) in Python?

When the CPython REPL executes a line, it will:

  1. parse, and compile it to a code object of bytecode, and then
  2. execute the bytecode.

The compilation result can be checked through the dis module :

>>> import dis
>>> dis.dis('a = [1234, 1234, 5678, 90123, 5678, 4321]')
  1           0 LOAD_CONST               0 (1234)
              2 LOAD_CONST               0 (1234)
              4 LOAD_CONST               1 (5678)
              6 LOAD_CONST               2 (90123)
              8 LOAD_CONST               1 (5678)
             10 LOAD_CONST               3 (4321)
             12 BUILD_LIST               6
             14 STORE_NAME               0 (a)
             16 LOAD_CONST               4 (None)
             18 RETURN_VALUE

Note that all 1234s are loaded with "LOAD_CONST 0", and all 5678s are loaded with "LOAD_CONST 1". These indices refer to the constant table associated with the code object. Here, the table is (1234, 5678, 90123, 4321, None).

The compiler knows that all the copies of 1234 in the code object are the same, so it allocates only one object for all of them.

Therefore, as OP observed, a[0] and a[1] do indeed refer to the same object: the same constant from the constant table of the code object of that line of code.
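
The constant table described above can also be inspected directly. The following is a minimal sketch, assuming CPython; the "<demo>" label and the variable names are only illustrative, and the exact contents of co_consts may differ between versions and implementations.

```python
# Compile one line the way the REPL would, then look at its constant table.
source = "a = [1234, 1234]"
code = compile(source, "<demo>", "exec")  # "<demo>" is an arbitrary label

# co_consts is the constant table of the code object.
print(code.co_consts)

# Run the code object and check whether both elements are one object.
namespace = {}
exec(code, namespace)
a = namespace["a"]
same_object = a[0] is a[1]
print(same_object)  # expected to be True on CPython, not guaranteed elsewhere
```

If 1234 appears only once in the printed table, both list elements were loaded from that single entry.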

When you execute b = 1234 , this will again be compiled and executed, independent of the previous line, so a different object will be allocated.
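
A rough way to imitate two separate REPL inputs from a script is to compile each line on its own. This is only a sketch under that assumption; the labels "<line 1>" and "<line 2>" are made up, and whether the two integers end up as one object is an implementation detail that should not be relied on.

```python
# Compile two lines independently, as the REPL does for separate inputs.
first = compile("a = [1234, 1234]", "<line 1>", "exec")
second = compile("b = 1234", "<line 2>", "exec")

namespace = {}
exec(first, namespace)
exec(second, namespace)

a, b = namespace["a"], namespace["b"]
print(a[0] == b)  # True: the values are equal

# Identity is a separate matter; each code object has its own constant table.
shared_across_lines = a[0] is b
print(shared_across_lines)  # typically False on CPython; an implementation detail
```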

(You may read http://akaptur.com/blog/categories/python-internals/ for a brief introduction to how code objects are interpreted.)


Outside of the REPL, when you execute a *.py file, the module body and each function are compiled into separate code objects, so when we run:

a = [1234, 1234]
b = 1234
print(id(a[0]), id(a[1]))
print(id(b))

a = (lambda: [1234, 1234])()
b = (lambda: 1234)()
print(id(a[0]), id(a[1]))
print(id(b))

We may see something like:

4415536880 4415536880
4415536880
4415536912 4415536912
4415537104
  • The first three numbers share the same address, 4415536880. They all refer to one constant of the "__main__" code object.
  • Then a[0] and a[1] share the address 4415536912, a constant of the first lambda's code object.
  • Finally, b has the address 4415537104, a constant of the second lambda's code object.
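
The per-function constant tables can be looked at through each function's __code__ attribute. This is a small sketch with hypothetical function names (make_list, make_int). Whether the tables share one integer object depends on the implementation and version, and newer CPython releases may share equal constants across code objects compiled from the same file, so the printed identities can differ from the output shown above.

```python
def make_list():
    return [1234, 1234]

def make_int():
    return 1234

# Each function carries its own code object with its own constant table.
list_consts = make_list.__code__.co_consts
int_consts = make_int.__code__.co_consts
print(list_consts)
print(int_consts)

# Whether the two tables share one int object is up to the implementation.
a = make_list()
b = make_int()
print(a[0] is a[1], a[0] is b)
```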

Also note that this result is valid for CPython only. Other implementations have different strategies on allocating constants. For instance, running the above code in PyPy gives:

19745 19745
19745
19745 19745
19745

There is no rule or guarantee stating that the id(a[0]) should be equal to the id(a[1]), so the question itself is moot. The question you should be asking is why id(a[0]) and id(a[1]) are in fact the same.
If you do a.append(1234) followed by id(a[2]) you may or may not get the same id. As @hiro protagonist has pointed out, these are just internal optimizations that you shouldn't depend upon.

A Python list is very much unlike a C array.

A C array is just a block of contiguous memory, so the address of its first (0th) element is the address of the array itself, by definition. Array access in C is just pointer arithmetic, and the [] notation is just a thin crust of syntactic sugar over that pointer arithmetic. As a function parameter, the declaration int x[] is just another form of int *x.

For the sake of the example, let's assume that in Python, id(x) is the "memory address of x", as &x would be in C. (This is not true for all Python implementations, and it is an implementation detail even in CPython. It's just a unique number.)

In C, an int is just an architecture-dependent number of bytes, so for int x = 1 the expression &x points to these bytes. Everything in Python is an object, including numbers. This is why id(1) refers to an object of type int that describes the number 1. You can call its methods: (1).__str__() returns the string '1'.

So, when you have x = [1, 2, 3] , id(x) is a "pointer" to a list object with three elements. The list object itself is pretty complex. But x[0] is not the bytes that comprise the integer value 1; it's internally a reference to an int object for number 1. Thus id(x[0]) is a "pointer" to that object.

In C terms, the elements of the array could be seen as pointers to the objects stored in it, not the objects themselves.
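
That a list stores references rather than the objects themselves is easiest to see with a mutable element. A minimal sketch, with made-up variable names:

```python
inner = [0]
x = [inner, inner]        # two slots, one object

print(x[0] is x[1])       # True: both slots refer to the same list object
x[0].append(1)            # mutate the object through the first slot
print(x[1])               # [0, 1]: the change is visible through the second slot

y = list(x)               # a shallow copy copies the references, not the objects
print(y[0] is inner)      # True
```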

Since there's no point in having two objects representing the same number 1, id(1) is always the same during a CPython interpreter run. An illustration:

x = [1, 2, 3]
y = [1, 100, 1000]

assert id(x) != id(y)  # obviously
assert id(x[0]) == id(y[0]) == id(1) # yes, the same int object

CPython actually preallocates objects for a small range of the most-used integers, -5 through 256 in current versions (see the comments in the CPython source). Larger numbers are not preallocated, which can lead to two 'copies' of a larger number having different id() values.
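
To observe the preallocation without the compiler's constant table getting in the way, the integers can be built at run time. This is a sketch only: built_twice_is_shared is a hypothetical helper, and the results describe CPython behaviour that other implementations need not share.

```python
import sys

def built_twice_is_shared(text):
    """Build the same integer twice at run time and report whether
    both results are the same object."""
    first = int(text)
    second = int(text)
    return first is second

print(sys.implementation.name)
small_shared = built_twice_is_shared("100")
large_shared = built_twice_is_shared("1234567")
print(small_shared)  # True on CPython (small ints are preallocated)
print(large_shared)  # typically False on CPython
```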

Note that id() gives the identity of the object that a variable or literal refers to, not of the name itself. For every object used in your program (even one created inside the id() call itself), id() returns an identifier that is unique among the objects alive at the same time. This can be used by:

  • User: to check whether two variables refer to the same object, as in: a is b
  • Python: to optimise memory, i.e. to avoid unwanted duplication of the same data in memory
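
The difference between equal values and identical objects can be sketched with lists, where the outcome does not depend on any optimisation (the names a, b and c are only illustrative):

```python
a = [1234, 1234]
b = [1234, 1234]   # equal value, separate list object
c = a              # another name for the same list object

print(a == b)          # True: same value
print(a is b)          # False: two distinct list objects
print(a is c)          # True: one object, two names
print(id(a) == id(c))  # True: "is" compares identities, as id() does
```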

As for your case, it isn't even guaranteed that a[0] and a[1] will have the same id, even though their values are equal. It depends on the order in which literals and variables are created over the life of the program, and it is handled internally by Python.

Case 1:

Type "help", "copyright", "credits" or "license" for more information.
>>> a=[1234,1234] 
>>> id(a[0])
52687424
>>> id(a[1])
52687424

Case 2 (note that at the end of this case, a[0] and a[1] have the same value but different ids):

Type "help", "copyright", "credits" or "license" for more information.
>>> a=[1,1234]
>>> id(1)
1776174736
>>> id(1234)
14611088
>>> id(a[0])
1776174736
>>> id(a[1])
14611008
>>> a[0]=1234
>>> id(1234)
14611104
>>> id(a[0])
14611152
>>> id(a[1])
14611008
>>>
