Is avoiding expensive __init__ a good reason to use __new__?

Question

In my project, we have a class based on set. It can be initialised from a string, an iterable (e.g. a tuple) of strings, or other custom classes. When initialised with an iterable, it converts each item to a particular custom class if it is not one already.

Because it can be initialised from a variety of data structures, many of the methods that operate on this class (such as __and__) are liberal in what they accept and simply convert their arguments to this class (i.e. initialise a new instance). We are finding this rather slow when the argument is already an instance of the class and has a lot of members, because it iterates through them all and checks that each is the right type.

I was thinking that to avoid this, we could add a __new__ method to the class that returns the argument directly if it is already an instance of the class. Would this be a reasonable use of __new__?

Answer 1

Adding a __new__ method will not solve your problem. From the documentation for __new__:

If __new__() returns an instance of cls, then the new instance's __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to __new__().

In other words, returning the same instance will not prevent Python from calling __init__. You can verify this quite easily:

In [20]: class A:
    ...:     def __new__(cls, arg):
    ...:         if isinstance(arg, cls):
    ...:             print('here')
    ...:             return arg
    ...:         return super().__new__(cls)
    ...:     def __init__(self, values):
    ...:         self.values = list(values)

In [21]: a = A([1,2,3])

In [22]: A(a)
here
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-c206e38274e0> in <module>()
----> 1 A(a)

<ipython-input-20-5a7322f37287> in __init__(self, values)
      6         return super().__new__(cls)
      7     def __init__(self, values):
----> 8         self.values = list(values)

TypeError: 'A' object is not iterable

You may be able to make this work if you do not implement __init__ at all and implement only __new__. I believe this is what tuple does.
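As a rough sketch of that idea, assuming the class can be made immutable by basing it on frozenset rather than set: the class below defines only __new__, so nothing re-initialises an instance that is returned as-is. The name Tags and the str() call standing in for the per-item conversion are hypothetical, not taken from the question.

```python
class Tags(frozenset):
    """Immutable set-like class that defines __new__ but no __init__."""

    def __new__(cls, items=()):
        # Fast path: an exact instance is already validated, so reuse it.
        if type(items) is cls:
            return items
        # A lone string is treated as a single member, not as characters.
        if isinstance(items, str):
            items = (items,)
        # str() is a stand-in for the real (expensive) per-item conversion.
        return super().__new__(cls, (str(item) for item in items))


a = Tags(["x", "y"])
b = Tags(a)      # no iteration over the members, the same object comes back
c = Tags("x")    # built from a single string
```

Here `Tags(a) is a` holds, which is only safe because the instance cannot be changed afterwards.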

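Even then, that behaviour is acceptable only if your class is immutable (tuple, for example, does this), because handing back the same object is then indistinguishable from handing back a copy. If the class is mutable, you are asking for hidden bugs: a caller who thinks they have a fresh instance would actually be modifying the original.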
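To illustrate the kind of hidden bug meant here, a minimal sketch with a hypothetical mutable class Bag whose __new__ returns its argument and whose __init__ is guarded so the call no longer fails. The helper with_extra is also made up for the example.

```python
class Bag:
    """Mutable container whose constructor may hand back its argument."""

    def __new__(cls, values=()):
        if isinstance(values, cls):
            return values              # same object, not a copy
        return super().__new__(cls)

    def __init__(self, values=()):
        if values is self:
            return                     # __init__ still runs, so skip re-initialising
        self.values = list(values)


def with_extra(bag_like, item):
    # Looks like a defensive copy, but is not one when bag_like is a Bag.
    copy = Bag(bag_like)
    copy.values.append(item)
    return copy


original = Bag([1, 2, 3])
result = with_extra(original, 4)   # silently mutates original as well

source = [1, 2]
other = with_extra(source, 3)      # a plain list is copied, so source is untouched
```

The same call either copies or aliases depending on the argument's type, which is exactly the sort of behaviour that is hard to spot in review.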

A more sensible approach is to do what set does: the operator methods (__and__, __or__ and so on) operate only on sets, while set also provides named methods that work with any iterable:

In [30]: set([1,2,3]) & [1,2]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-dfd866b6c99b> in <module>()
----> 1 set([1,2,3]) & [1,2]

TypeError: unsupported operand type(s) for &: 'set' and 'list'

In [31]: set([1,2,3]) & set([1,2])
Out[31]: {1, 2}

In [32]: set([1,2,3]).intersection([1,2])
Out[32]: {1, 2}

In this way the user can choose between speed and flexibility of the API.
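Applied to a class like the one in the question, the split might look roughly like this. It is only a sketch: TagSet is a hypothetical name, and str() stands in for whatever per-item conversion the real class performs.

```python
class TagSet(set):
    """Strict operators, lenient named methods, in the style of set."""

    def __init__(self, items=()):
        if isinstance(items, str):
            items = (items,)
        # str() is a placeholder for the real (expensive) conversion.
        super().__init__(str(item) for item in items)

    def __and__(self, other):
        # Strict: accept only instances, so no conversion is ever needed.
        if not isinstance(other, TagSet):
            return NotImplemented
        result = TagSet()
        set.update(result, set.intersection(self, other))
        return result

    def intersection(self, other):
        # Lenient: convert first, then reuse the strict operator.
        return self & TagSet(other)


tags = TagSet(["a", "b"])
fast = tags & TagSet(["b", "c"])       # both sides already validated
flexible = tags.intersection(["a"])    # any iterable is accepted here

try:
    tags & ["a"]
    operator_rejected_list = False
except TypeError:
    operator_rejected_list = True
```

Returning NotImplemented from __and__ lets Python raise the usual "unsupported operand type(s)" TypeError itself.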

A simpler approach is the one proposed by unutbu: use isinstance instead of duck-typing when implementing the operations.
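A minimal sketch of that isinstance check, again with a hypothetical TagSet class and str() as a placeholder for the real conversion. The conversions counter exists only to make the skipped work visible.

```python
class TagSet(set):
    conversions = 0   # counts how often the slow constructor path runs

    def __init__(self, items=()):
        if isinstance(items, str):
            items = (items,)
        TagSet.conversions += 1
        # str() is a placeholder for the real (expensive) conversion.
        super().__init__(str(item) for item in items)

    def __and__(self, other):
        # Convert only when the argument is not already a TagSet.
        if not isinstance(other, TagSet):
            other = TagSet(other)
        result = TagSet.__new__(TagSet)   # empty instance, __init__ bypassed
        set.update(result, set.intersection(self, other))
        return result


a = TagSet(["a", "b"])
b = TagSet(["b", "c"])
before = TagSet.conversions

both = a & b                   # no conversion: both are already TagSet
after_fast = TagSet.conversions

mixed = a & ["a"]              # one conversion for the list argument
after_slow = TagSet.conversions
```

The operators stay liberal in what they accept, but instances of the class no longer pay for re-validation.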

1 answer

solution1
3 ACCEPTED 2014-08-28 08:19:58