
RecursionError when calling copy.deepcopy in Python

I have a problem in Python.

I have a class with a custom __getattr__:

class ChoiceNumToName(object):
    def __init__(self, django_choice_tuple):
        self.ods_choice_tuple = django_choice_tuple
        self.choice_data = {}
        self.choice_point = -1
        for choice_value, choice_name in django_choice_tuple:
            self.choice_data.setdefault(choice_name, choice_value)

    def __getattr__(self, item):
        if item in self.choice_data:
            return self.choice_data[item]
        else:
            raise AttributeError("no attribute %s" % item)

    def __str__(self):
        return str(self.ods_choice_tuple)

    def __iter__(self):
        self.choice_point = -1
        return self

    def __next__(self):
        self.choice_point += 1
        try:
            return self.ods_choice_tuple[self.choice_point]
        except IndexError:
            raise StopIteration()

When I execute this:

import copy

a = ChoiceNumToName((
    (1, "running"),
    (2, "stopped"),
))
b = copy.deepcopy(a)

it raises RecursionError: maximum recursion depth exceeded while calling a Python object

The fix is simple: change the __getattr__ method to this:

def __getattr__(self, item):
    if item == "__setstate__":
        raise AttributeError(item)
    if item in self.choice_data:
        return self.choice_data[item]
    else:
        raise AttributeError("no attribute %s" % item)

It works well.

I found this solution here: https://github.com/python-babel/flask-babel/commit/8319a7f44f4a0b97298d20ad702f7618e6bdab6a

But can anyone tell me why?

TL;DR: Your __getattr__ is called before choice_data has been added to the instance dictionary, which causes it to recurse endlessly. A better way to address the problem is to immediately raise AttributeError for any attribute name beginning with __, which also catches other special or internal attributes.

This happens because the __init__ method is not called when an object is copied. Instead, a new, empty object is created, and its __dict__ is empty. Python's pickle protocol (which the copy module also uses) has a hook, __setstate__, that lets a class customize how state is applied (the state is normally just the contents of __dict__ but, e.g. if __getstate__ is provided, it can be any object). To see whether that hook is present, hasattr(newobj, '__setstate__') is called. Since there is no __setstate__ in the MRO or in the __dict__, your __getattr__ is called. Your __getattr__ then tries to access self.choice_data but, as noted above, the __dict__ is currently empty. This causes __getattr__ to be invoked again to look up choice_data, which starts the infinite recursion.
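The sequence described above can be reproduced without the copy module at all. This is only a minimal sketch: the class name Fragile is a hypothetical, stripped-down stand-in for the class in the question, and Fragile.__new__(Fragile) is used to mimic the empty object that the copy machinery creates.

```python
class Fragile(object):
    """Hypothetical stand-in for ChoiceNumToName, reduced to the relevant parts."""

    def __init__(self, data):
        self.choice_data = dict(data)

    def __getattr__(self, item):
        # self.choice_data is itself an attribute lookup; when choice_data is
        # missing from __dict__, that lookup lands back in this method.
        if item in self.choice_data:
            return self.choice_data[item]
        raise AttributeError("no attribute %s" % item)


# Mimic what copy/pickle do: create the object without running __init__.
empty = Fragile.__new__(Fragile)
assert empty.__dict__ == {}

try:
    hasattr(empty, "__setstate__")
    recursed = False
except RecursionError:
    recursed = True

print(recursed)
```

On Python 3 the hasattr() call does not return False; the RecursionError raised inside __getattr__ escapes to the caller.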

Special-casing __setstate__ stops the recursion by bailing out early when __setstate__ is looked up. When that lookup fails, the default copying mechanism takes effect and initializes the new object's __dict__ from the state. To my mind, special-casing only __setstate__ is not the best solution. I think it is best to immediately raise AttributeError for any special or internal attributes, i.e. the ones that begin with __, since that prevents other strange situations from occurring. Another possibility is to avoid attribute lookup on self within __getattr__ by writing self.__dict__['choice_data'] or object.__getattribute__(self, 'choice_data'). You can also ensure that choice_data is always present by implementing __new__ and assigning it to the object there.
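The alternatives listed above might look roughly like the following. This is a sketch under assumptions rather than a definitive implementation: ChoiceNames and ChoiceNamesNew are hypothetical, reduced versions of the class in the question, and the iterator methods are left out.

```python
import copy


class ChoiceNames(object):
    """Hypothetical variant: guard dunder names and read __dict__ directly."""

    def __init__(self, choices):
        self.choice_data = {}
        for value, name in choices:
            self.choice_data.setdefault(name, value)

    def __getattr__(self, item):
        # Never answer for special/internal names; this covers __setstate__,
        # __deepcopy__, and anything else the copy/pickle machinery probes.
        if item.startswith("__"):
            raise AttributeError(item)
        # Read from the instance dict so that a missing choice_data
        # cannot re-enter __getattr__.
        data = self.__dict__.get("choice_data", {})
        if item in data:
            return data[item]
        raise AttributeError("no attribute %s" % item)


class ChoiceNamesNew(object):
    """Hypothetical variant: make choice_data exist as soon as __new__ returns."""

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        self.choice_data = {}  # present even when __init__ is skipped
        return self

    def __init__(self, choices):
        for value, name in choices:
            self.choice_data.setdefault(name, value)

    def __getattr__(self, item):
        if item in self.choice_data:
            return self.choice_data[item]
        raise AttributeError("no attribute %s" % item)


choices = ((1, "running"), (2, "stopped"))
clone_guarded = copy.deepcopy(ChoiceNames(choices))
clone_new = copy.deepcopy(ChoiceNamesNew(choices))
print(clone_guarded.running, clone_new.stopped)
```

Both variants copy without recursing. The dunder guard is the broader of the two, since it also keeps __getattr__ from answering for other protocol probes.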

The methods __getstate__ and __setstate__ are used in pickling operations. Why does this matter? From the Python docs on copying:

Classes can use the same interfaces to control copying that they use to control pickling.

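Your class does not define __setstate__, so the copy machinery's lookup for it falls through to your __getattr__, which then recurses while trying to read choice_data on the still-empty object, hence the RecursionError.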

I just hit this issue myself. Basically, any class whose __getattr__ accesses an instance attribute through self will hit infinite recursion and raise a RecursionError when used with copy.copy or copy.deepcopy. Geoff Reedy's answer explains it pretty well, but I found a nicer writeup here.

Interestingly, this did not surface in Python 2, because hasattr() there swallowed every exception and returned False, which hid the recursion error.
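For comparison, Python 3's hasattr() treats only AttributeError as "attribute missing" and lets everything else propagate. A small sketch illustrating this (the class name Noisy is hypothetical and exists only for the demonstration):

```python
class Noisy(object):
    def __getattr__(self, item):
        # Any exception other than AttributeError raised here
        # escapes hasattr() in Python 3.
        raise RuntimeError("lookup of %r blew up" % item)


try:
    hasattr(Noisy(), "anything")
    propagated = False
except RuntimeError:
    propagated = True

print(propagated)
# RecursionError derives from RuntimeError, not from AttributeError,
# so hasattr() does not swallow it either.
print(issubclass(RecursionError, RuntimeError))
print(issubclass(RecursionError, AttributeError))
```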

Adding the check against __setstate__ will solve the specific issue, but a more general approach is to guard against the state variable itself.

In your case, that would be:

def __getattr__(self, item):
    if item == "choice_data":  # Object state not initialized
        raise AttributeError(item)
    if item in self.choice_data:
        return self.choice_data[item]
    else:
        raise AttributeError("no attribute %s" % item)
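To check that guard end to end, here is a sketch that puts it into a trimmed copy of the class from the question (the iterator and __str__ methods are omitted for brevity) and runs the deepcopy again:

```python
import copy


class ChoiceNumToName(object):
    def __init__(self, django_choice_tuple):
        self.ods_choice_tuple = django_choice_tuple
        self.choice_data = {}
        for choice_value, choice_name in django_choice_tuple:
            self.choice_data.setdefault(choice_name, choice_value)

    def __getattr__(self, item):
        if item == "choice_data":  # Object state not initialized
            raise AttributeError(item)
        if item in self.choice_data:
            return self.choice_data[item]
        raise AttributeError("no attribute %s" % item)


a = ChoiceNumToName((
    (1, "running"),
    (2, "stopped"),
))
b = copy.deepcopy(a)
print(b.running, b.stopped)
```

While the copy is still empty, the lookup of __setstate__ reaches self.choice_data, which now raises AttributeError straight away. hasattr() therefore returns False, and the default state restore fills in __dict__.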
