
Mutable and Immutable Objects

I am trying to get my head around mutable and immutable objects. I have read that strings are immutable and that each string is a separate object with its own object ID. I am trying to verify this with the simple code below; however, I see the same object ID for several strings that are not equal. Can someone please clarify this? Thanks in advance.

mystring = ""
mylist = ["This ", "That ", "This ", "That ", "This ", "That ", "This ", "That "]

for item in mylist:
    mystring = mystring + item
    print("mystring: ", mystring, "ID of mystring: ", id(mystring))

which results in the output below:

mystring:  This  ID of mystring:  6407264
mystring:  This That  ID of mystring:  42523448
mystring:  This That This  ID of mystring:  42523448
mystring:  This That This That  ID of mystring:  6417200
mystring:  This That This That This  ID of mystring:  42785608
mystring:  This That This That This That  ID of mystring:  42785608
mystring:  This That This That This That This  ID of mystring:  42837536
mystring:  This That This That This That This That  ID of mystring:  42775856

Python is allowed to reuse object IDs for objects with non-overlapping lifetimes, but you're seeing ID reuse in cases where there should be a lifetime overlap. Specifically, during execution of this statement:

mystring = mystring + item

between the evaluation of mystring + item and the assignment to mystring, there should be a lifetime overlap between any two successive values of mystring. You're seeing ID reuse for successive values of mystring, which shouldn't happen.

The effect you're seeing happens because of an optimization in the CPython bytecode evaluation loop, where statements of the form

string1 = string1 + string2

or

string1 += string2

are detected, and if the interpreter can confirm that string1 has no other references, it attempts to perform the concatenation by mutating string1 in-place. You can see the code in Python/ceval.c under unicode_concatenate. This optimization is mostly invisible, due to the refcount check, but the effect on id values is one way it's visible.
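
To see the refcount check at work, here is a minimal sketch. The helper count_repeated_ids and its keep_alias flag are names made up for this illustration. When a second reference to the old string is kept, the in-place path cannot apply, so the old and new strings are both alive and must have different ids. Without the alias, whether any ids repeat depends on the CPython version and the memory allocator, so treat that count as an observation, not a guarantee.

```python
def count_repeated_ids(parts, keep_alias):
    """Count steps where the new string has the same id() as the previous one."""
    text = ""
    repeats = 0
    for part in parts:
        # Hypothetical switch: hold a second reference to the old value or not.
        alias = text if keep_alias else None
        old_id = id(text)
        text = text + part
        if id(text) == old_id:
            repeats += 1
        del alias
    return repeats


parts = ["This ", "That "] * 4
# With the alias, old and new strings coexist, so this count is always 0.
print("alias kept:", count_repeated_ids(parts, keep_alias=True))
# Without it, the count is implementation-dependent (it may well be 0 too).
print("no alias:  ", count_repeated_ids(parts, keep_alias=False))
```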

Strings are immutable. There is no str method that lets you mutate them.
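
A quick way to check this for yourself, as a small sketch (try_item_assignment is just a helper name chosen for the example): item assignment on a str raises TypeError, and methods such as lower() return a new string and leave the original untouched.

```python
def try_item_assignment(s):
    """Try to overwrite the first character; return the error message, or None."""
    try:
        s[0] = "t"
    except TypeError as exc:
        return str(exc)
    return None


text = "This"
print(try_item_assignment(text))  # a TypeError message, since str has no item assignment

lowered = text.lower()            # returns a new string
print(text, lowered)              # the original is unchanged
```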

That being said, the reason you see the same id multiple times is that once an object is no longer in use, Python can reuse its position in memory. In CPython, id returns the object's memory address, so it is unique only among objects that are alive at the same time.

One way to convince yourself that this is the reason for your observation is to keep a reference to every string you create, for example by appending each one to a list.
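
The same rule can be seen with plain objects instead of strings. This is only a sketch, and both function names are invented for it: objects that are all alive together always get distinct ids, while objects created and dropped one after another may share ids. How often that happens is up to the allocator.

```python
def ids_while_alive(count):
    """Create `count` objects, keep them all alive, and return their ids."""
    objects = [object() for _ in range(count)]
    return [id(obj) for obj in objects]


def ids_after_release(count):
    """Create `count` objects one at a time, dropping each before the next."""
    seen = []
    for _ in range(count):
        obj = object()
        seen.append(id(obj))
        del obj  # no references remain, so its memory may be reused
    return seen


print(len(set(ids_while_alive(100))))    # always 100
print(len(set(ids_after_release(100))))  # often much smaller in CPython, not guaranteed
```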

Code

mystring = ""
mylist = ["This ", "That ", "This ", "That ", "This ", "That ", "This ", "That "]

# A list to keep a reference to each string
created_strings = []

for item in mylist:
    mystring = mystring + item

    # Prevent mystring from being garbage collected by adding it to the list
    created_strings.append(mystring)

    print("mystring: ", mystring, "ID of mystring: ", id(mystring))

Output

mystring:  This  ID of mystring:  2522900655888
mystring:  This That  ID of mystring:  2522903930416
mystring:  This That This  ID of mystring:  2522903930544
mystring:  This That This That  ID of mystring:  2522902118880
mystring:  This That This That This  ID of mystring:  2522900546624
mystring:  This That This That This That  ID of mystring:  2522900546864
mystring:  This That This That This That This  ID of mystring:  2522902428376
mystring:  This That This That This That This That  ID of mystring:  2522900907952

Notice that now that the memory is never reclaimed, each object has a different id.
