Unable to deepcopy a class with both __init__ and __new__ defined

I'm having what seems to me a slightly weird problem. I have defined a class with both __init__ and __new__, shown below:

class Test:

    def __init__(self, num1):
        self.num1 = num1

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)
        new_inst.extra = 2
        return new_inst

If put to normal use, this works fine:

test = Test(1)
assert test.extra == 2

However, it will not copy.deepcopy:

import copy
copy.deepcopy(test)

gives

TypeError: __init__() missing 1 required positional argument: 'num1'

This may be related to "Decorating class with class wrapper and __new__". I can't see exactly how, but I'm trying a similar thing here: I need __new__ to apply a class wrapper to the Test instance I've created.

Any help gratefully received!

Technically it's not an issue to call __init__ from __new__, but it's redundant: once __new__ returns an instance of the class, __init__ is called on it automatically with the same arguments.
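To make the redundancy visible, here is a minimal sketch of the class from the question with a call counter added. The init_calls attribute is my own addition for illustration and is not part of the original code.

```python
class Test:
    init_calls = 0  # hypothetical counter, added only for this illustration

    def __init__(self, num1):
        type(self).init_calls += 1
        self.num1 = num1

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)  # explicit call: first run
        new_inst.extra = 2
        return new_inst  # Python then calls __init__ again: second run


test = Test(1)
print(Test.init_calls)  # 2
```

So in normal use __init__ runs twice per construction, which is harmless here but is a hint that the explicit call is not needed.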


Now coming to why deepcopy fails, we can look into its internals a bit.

When __deepcopy__ isn't defined on the class, deepcopy falls through to this branch:

reductor = getattr(x, "__reduce_ex__", None)
rv = reductor(4)

Here reductor(4) returns the function used to re-create the object, a tuple holding the type of the object (Test) plus any arguments to pass, and the object's state (in this case the items of the instance dictionary test.__dict__):

>>> !rv
(
    <function __newobj__ at 0x7f491938f1e0>,  # func
    (<class '__main__.Test'>,),  # type + args in a single tuple
    {'num1': 1, 'extra': 2},  # state
    None, None)
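If you want to inspect this tuple yourself without a debugger, something like the following should do. It is only a sketch using the class from the question; the variable names func, args and state are mine, chosen to match the parameters of _reconstruct.

```python
import copyreg


class Test:
    def __init__(self, num1):
        self.num1 = num1

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)
        new_inst.extra = 2
        return new_inst


test = Test(1)
rv = test.__reduce_ex__(4)  # 4 is the pickle protocol version
func, args, state = rv[0], rv[1], rv[2]

print(func is copyreg.__newobj__)  # the re-creation function
print(args)                        # the class, followed by any arguments
print(state)                       # contents of test.__dict__
```

Note that args contains only the class itself: nothing in it corresponds to num1.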

Now it calls _reconstruct with this data:

def _reconstruct(x, memo, func, args,
                 state=None, listiter=None, dictiter=None,
                 deepcopy=deepcopy):
    deep = memo is not None
    if deep and args:
        args = (deepcopy(arg, memo) for arg in args)
    y = func(*args)
    ...

Here func is __newobj__, so the func(*args) call ends up in:

def __newobj__(cls, *args):
    return cls.__new__(cls, *args)

But args is empty and cls is <class '__main__.Test'>, so this amounts to Test.__new__(Test). That in turn calls new_inst.__init__() with no arguments, and you get the error.
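You can trigger the same failure without going through deepcopy at all by calling copyreg.__newobj__ by hand. A small sketch, again with the class from the question; the exact wording of the message varies a little between Python versions, so I only print it here.

```python
import copyreg


class Test:
    def __init__(self, num1):
        self.num1 = num1

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)  # fails when args is empty
        new_inst.extra = 2
        return new_inst


error = None
try:
    # Same call deepcopy makes: only the class, no constructor arguments.
    copyreg.__newobj__(Test)
except TypeError as exc:
    error = exc
    print(exc)
```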


So how does Python decide on these arguments for your object, since that seems to be the problem?

For that we need to look into reductor(4), where reductor is __reduce_ex__ and the 4 passed to it is the pickle protocol version.

__reduce_ex__ internally calls reduce_newobj to get the object creation function, its arguments, the state, etc. for the new copy.

The arguments themselves are found using _PyObject_GetNewArguments.

This function looks for __getnewargs_ex__ or __getnewargs__ on the class. Since our class has neither, we get nothing for the arguments.
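To see that lookup in isolation, you can compare the argument tuple for two toy classes. Plain and WithNewArgs are hypothetical names I made up for this sketch; they are not from the question.

```python
class Plain:
    pass


class WithNewArgs:
    def __getnewargs__(self):
        return ("Eggs",)


# Index 1 of the reduce tuple is the argument tuple passed to __newobj__.
plain_args = Plain().__reduce_ex__(4)[1]
custom_args = WithNewArgs().__reduce_ex__(4)[1]

print(plain_args)   # only the class
print(custom_args)  # the class followed by 'Eggs'
```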


Now let's add this method and try again:

import copy


class Test:

    def __init__(self, num1):
        self.num1 = num1

    def __getnewargs__(self):
        return ('Eggs',)

    def __new__(cls, *args, **kwargs):
        print(args)
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)
        new_inst.extra = []
        return new_inst

test = Test([])

xx = copy.deepcopy(test)

print(xx.num1, test.num1, id(xx.num1), id(test.num1))

# ([],)
# ('Eggs',)
# [] [] 139725263987016 139725265534088

Surprisingly, the deep copy xx doesn't have Eggs stored in num1 even though we return it from __getnewargs__. This is because _reconstruct, after creating the instance, updates it with a deep copy of the state it obtained originally, overriding the values set during construction.


def _reconstruct(x, memo, func, args,
                 state=None, listiter=None, dictiter=None,
                 deepcopy=deepcopy):
    deep = memo is not None
    if deep and args:
        args = (deepcopy(arg, memo) for arg in args)
    y = func(*args)
    if deep:
        memo[id(x)] = y

    if state is not None:
        ...
            if state is not None:
                y.__dict__.update(state)  # <--- state is applied on top of the new instance
    ...
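The two steps can also be replayed by hand to watch the override happen. This is a rough imitation of what _reconstruct does, not its actual code; the names before and after are mine.

```python
import copy


class Test:
    def __init__(self, num1):
        self.num1 = num1

    def __getnewargs__(self):
        return ("Eggs",)

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)
        new_inst.extra = []
        return new_inst


test = Test([])
func, args, state = test.__reduce_ex__(4)[:3]

y = func(*args)   # step 1: Test.__new__(Test, 'Eggs')
before = y.num1
y.__dict__.update(copy.deepcopy(state))  # step 2: original state restored
after = y.num1

print(before)  # Eggs
print(after)   # []
```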

Are there any other ways to do it?

Note that the explanation and the working example above are only meant to explain the issue. I wouldn't call it the best or the worst way to do it.

Yes, you could define your own __deepcopy__ hook on the class to control the behavior further. I'll leave this as an exercise for the reader.
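For anyone who wants a starting point, one possible shape for such a hook is sketched below. It assumes it is acceptable to bypass Test.__new__ when copying, which may not hold if __new__ does extra work (such as wrapping the instance) that copies also need.

```python
import copy


class Test:
    def __init__(self, num1):
        self.num1 = num1

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.__init__(*args, **kwargs)
        new_inst.extra = 2
        return new_inst

    def __deepcopy__(self, memo):
        # Bypass Test.__new__ so no constructor arguments are needed.
        new_inst = object.__new__(type(self))
        memo[id(self)] = new_inst  # register early so reference cycles resolve
        for name, value in self.__dict__.items():
            setattr(new_inst, name, copy.deepcopy(value, memo))
        return new_inst


test = Test([1, 2])
clone = copy.deepcopy(test)
print(clone.num1, clone.extra)  # [1, 2] 2
print(clone.num1 is test.num1)  # False
```

Passing memo through to the nested deepcopy calls keeps shared and self-referencing objects consistent in the copy.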

OK - it's because I'm doing it wrong - I shouldn't explicitly call __init__ from __new__. Mea culpa.
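For reference, a sketch of what that looks like with the explicit call removed. As far as I can tell, deepcopy then works because Test.__new__ no longer needs any constructor arguments and num1 is restored from the copied state.

```python
import copy


class Test:
    def __init__(self, num1):
        self.num1 = num1

    def __new__(cls, *args, **kwargs):
        new_inst = object.__new__(cls)
        new_inst.extra = 2
        return new_inst  # __init__ runs automatically after this


test = Test(1)
clone = copy.deepcopy(test)
print(clone.num1, clone.extra)  # 1 2
```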
