
Python, why does passing and changing a class self-variable with an external function work for manipulating iterables but not variables

I ran into a very hard to track down bug in my program where a class self-iterable was manipulated by an external function and discovered that some self-variables can be changed and some can't. Is it possible to manipulate a single self-variable like an int with an external function without passing the entire class?

Here's some example code:

class TestClass(object):
    def __init__(self):
        self.my_var = 0
        self.my_str = "Foo"
        self.my_tuple = (1, 2, 3)
        self.my_list = [1, 2, 3]
        self.my_dict = {"one": 1, "two": 2, "three": 3}
        self.manipulate_1()
        self.manipulate_2()

    def manipulate_1(self):
        external_1(self.my_var, self.my_list, self.my_str, self.my_tuple, self.my_dict)
        print(self.my_var)
        print(self.my_list)
        print(self.my_str)
        print(self.my_tuple[0])
        print(self.my_dict["one"])
        # prints 0, [15], Foo, 1, 15
    def manipulate_2(self):
        external_2(self)
        print("\n" + str(self.my_var))
        # prints 1

def external_1(instance_var, instance_list, instance_str, instance_tuple, instance_dict):
    instance_var += 1
    del instance_list[0]
    del instance_list[0]
    instance_list[0] = 15
    instance_str = "Bar"
    list(instance_tuple)[0] = 15
    instance_dict.update({"one": 15})


def external_2(instance):
    instance.my_var += 1


a = TestClass()

The list can be manipulated by deleting entries just by passing it as an argument, while the variable can only be manipulated while passing self.

Is there a way to manipulate a single self-variable? If not, does passing self come with any performance issues or other issues? I.e., if I want to manipulate a self-variable, is using a method mandatory?

Python's argument passing works the same for all objects - the original object is passed (not "a copy of", not "a reference to", not "a pointer to" - it IS the object itself that is passed), regardless of the object's type, whether it's mutable or not, etc. These objects are then bound to the matching parameter names as local variables.
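A minimal sketch of the point above, assuming a hypothetical helper named `identity_inside` (not part of the question's code): the object seen inside the function is the very same object the caller passed, whether it is mutable or not.

```python
def identity_inside(param):
    # `param` is a local name bound to the object that was passed in;
    # id() reports the identity of that object.
    return id(param)


numbers = [1, 2, 3]   # mutable
label = "Foo"         # immutable

# In both cases the function sees the caller's object, not a copy.
print(identity_inside(numbers) == id(numbers))  # True
print(identity_inside(label) == id(label))      # True
```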

The difference you observe is actually the result of the difference between two totally distinct operations: rebinding a (local) name and mutating an object.

Since parameters are local variables (local names, actually), rebinding a parameter in your function's body only makes this name point to another object, and does not impact the original argument (except for decrementing its reference count). So obviously this has absolutely no effect outside the function itself.

Now when you mutate one of your arguments, since you are working on the very object you passed to the function, those changes are, very obviously, visible outside the function.
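To make the two operations concrete, here is a small sketch using two hypothetical functions, `rebind` and `mutate` (names chosen for illustration only). The first rebinds its local name to a new list; the second changes the list it was given.

```python
def rebind(items):
    # Rebinding: `items + [99]` builds a new list, and the local name
    # `items` now points to it. The caller's list is untouched.
    items = items + [99]
    return items


def mutate(items):
    # Mutation: the list object that was passed in is changed in place.
    items.append(99)


original = [1, 2, 3]

rebound = rebind(original)
print(original)  # [1, 2, 3]
print(rebound)   # [1, 2, 3, 99]

mutate(original)
print(original)  # [1, 2, 3, 99]
```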

Here:

def external_1(instance_var, instance_list, instance_str, instance_tuple, instance_dict):
    # this one rebinds the local name `instance_var`
    # to a new `int` object. Doesn't affect the object
    # previously bound to `instance_var`
    instance_var += 1

    # those three statements mutate `instance_list`,
    # so the effect is visible outside the function
    del instance_list[0]
    del instance_list[0]
    instance_list[0] = 15

    # this one rebinds the local name `instance_str`
    # to the literal string "Bar". Same as for `instance_var`
    instance_str = "Bar"

    # this one creates a list from `instance_tuple`,
    # mutates this list, and discards it. IOW it eats a
    # couple of processor cycles for nothing.
    list(instance_tuple)[0] = 15

    # and this one mutates `instance_dict` so the
    # effect is visible outside the function
    instance_dict.update({"one": 15})

And here:

def external_2(instance):
    # this one mutates `instance` - it's actually
    # syntactic sugar for 
    # `instance.__setattr__("my_var", instance.__getattribute__("my_var") + 1)`
    instance.my_var += 1
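As for changing a single attribute without passing the whole instance, one common approach (a sketch, not the only option) is to have the external function return the new value and let the caller rebind the attribute. The class `Counter` and the helpers `incremented` and `increment_attr` below are hypothetical names introduced for illustration.

```python
class Counter:
    def __init__(self):
        self.my_var = 0


def incremented(value):
    # Hypothetical helper: it cannot change the caller's int,
    # so it returns a new one for the caller to store.
    return value + 1


def increment_attr(obj, name):
    # Hypothetical helper: roughly what `obj.<name> += 1` does,
    # spelled out with getattr/setattr. This mutates `obj`.
    setattr(obj, name, getattr(obj, name) + 1)


c = Counter()

c.my_var = incremented(c.my_var)  # the caller rebinds the attribute
print(c.my_var)  # 1

increment_attr(c, "my_var")       # the instance is mutated from outside
print(c.my_var)  # 2
```

In both cases the attribute is set on the instance; the only difference is whether the caller or the helper performs the assignment.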

As I already mentioned a couple of times in the comments, all this (and much more) is explained in full detail in Ned Batchelder's reference article.

Please see the addendum.

This is normal and expected behavior. It's because of the difference between sending a reference and sending a value.

With external_1(self.my_var, self.my_list) :

  • You send self.my_var, which is a value. This means that external_1 receives only a value. That value is then local to the function, so the class has no way of knowing if it changed. Try this to see it working:

    def external_1(instance_var, instance_list):
        instance_var += 1
        print('This will print one:', instance_var)
        del instance_list[0]
        del instance_list[0]

  • The variable self.my_list is a reference. This means you're sending the address of where to find the list. So the function external_1 will go to that address and change the list values there.

With external_2(self) you send external_2 a reference to the entire instance. So it does exactly the same as with self.my_list.

If you still don't fully understand, don't worry, it took me quite some time to understand these kinds of references (or pointers). There are millions of tutorials and videos about how they work.

Addendum:

@bruno is correct in saying that I'm not explaining it correctly in the technical sense, or how Python actually handles all the variables. I'm simply trying to explain what happens as an overview, and coming from the C world, it's how I understand it.


 