
What do `type.__init__` and `type.__new__` do when creating a new `type` instance (i.e. a class)?

As is well known, a lot of work is done behind the scenes when a new class is created in Python, such as setting attributes like __dict__, __class__, __metaclass__, etc.

I know that when a new class is created, the type.__new__ method is called, and type.__init__ is also called provided that __new__ returns an instance of type. So I guess these two methods are in charge of some of that work, but I cannot find any description in the docs of what they actually do. My question is: what exactly do these two methods do when making a class?

EDIT:

I know what a metaclass is and roughly what it does in the process of creating a type instance, but I am wondering how these two methods cooperate to create a type instance. Maybe @BrenBarn is right that this is implementation related, and I just want to make sure of that. For example, if I override the __new__ method in my own metaclass T and return type(clsname, bases, dct) directly, instead of calling __new__ on the base class type as people usually do, then neither T.__init__ nor type.__init__ will be called, since the returned object is not an instance of T. If so, what should I expect to miss owing to the absence of __init__? And can I expect that behavior to be consistent across implementations?
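The scenario in this edit can be checked with a short sketch. The names T, Chained, A, B and calls are made up for illustration, and the behavior shown is what I would expect from CPython 3. Note that the inner type(...) call is itself a complete class-creation call, so it still runs type.__new__ and type.__init__; it is T.__init__ that gets skipped.

```python
calls = []  # records which metaclass __init__ methods actually run


class T(type):
    def __new__(mcls, clsname, bases, dct):
        # Delegate to plain `type`: the result is an instance of type, not of T.
        return type(clsname, bases, dct)

    def __init__(cls, clsname, bases, dct):
        calls.append(("T.__init__", clsname))


class Chained(type):
    def __new__(mcls, clsname, bases, dct):
        # The usual pattern: the result is an instance of Chained.
        return super().__new__(mcls, clsname, bases, dct)

    def __init__(cls, clsname, bases, dct):
        calls.append(("Chained.__init__", clsname))


class A(metaclass=T):
    x = 1


class B(metaclass=Chained):
    x = 1


print(type(A), type(B))  # A is a plain class, B is an instance of Chained
print(calls)             # only Chained.__init__ was recorded
```

Class A is still a fully working class; the only thing lost is whatever T.__init__ (and any other method defined on T) would have contributed.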

"I am wondering about how these two methods cooperate to achieve the job of creating a type instance"

The type.__init__ method is only responsible for checking that there are 1 or 3 arguments and that there are no keyword arguments (in recent CPython versions the keyword check applies only to the one-argument form). The C source code for this is in the type_init() function in Objects/typeobject.c.

The type.__new__ method does all the remaining work of creating a new class. Here are the steps performed by type_new_impl in Objects/typeobject.c:
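A small sketch of that argument check, calling type.__init__ by hand on an existing class. The class name Plain and the variable names are illustrative only, and the behavior is what I would expect from CPython 3.

```python
class Plain:
    pass


# Snapshot the class so we can see whether type.__init__ changes anything.
before = dict(vars(Plain))

# Three arguments after the class itself (name, bases, namespace): accepted.
result = type.__init__(Plain, "Renamed", (), {"extra": 1})

# Nothing was renamed and nothing was added to the class namespace.
unchanged = dict(vars(Plain)) == before and Plain.__name__ == "Plain"

# Two arguments: rejected by the argument-count check.
try:
    type.__init__(Plain, "Renamed", ())
    rejected = False
except TypeError:
    rejected = True

print(result, unchanged, rejected)
```

In other words, a metaclass __init__ that never chains up to type.__init__ loses nothing except that argument check.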

  • type_new_init()
  • type_new_set_attrs()
  • PyType_Ready()
  • fixup_slot_dispatchers()
  • type_new_set_names()
  • type_new_init_subclass()
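The last two steps in that list are observable from Python. The sketch below records the order in which the hooks fire (Python 3.6 or later); Field, Base, Meta, Child and events are illustrative names, not part of any API.

```python
events = []  # order in which the hooks fire


class Field:
    def __set_name__(self, owner, name):
        events.append("__set_name__")


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        events.append("__init_subclass__")


class Meta(type):
    def __new__(mcls, name, bases, namespace):
        events.append("enter __new__")
        cls = super().__new__(mcls, name, bases, namespace)
        events.append("leave __new__")
        return cls

    def __init__(cls, name, bases, namespace):
        events.append("__init__")
        super().__init__(name, bases, namespace)


class Child(Base, metaclass=Meta):
    field = Field()


print(events)
```

Both __set_name__ and __init_subclass__ run inside type.__new__, in that order, and the class already has its MRO by the time the metaclass __init__ is reached.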

It matters a lot whether you implement these calls in Python or in C. For example, the call type(clsname, bases, dct) is the combination of tp_new and tp_init, but it calls an __init__ if and only if that __init__ was set up as the tp_init pointer on the metaclass of the newly created type, not if the metaclass merely has an __init__ in its __dict__. So you have access to that from C, not from Python. Of course, you are free to call __init__ explicitly yourself.

Similar problems exist in type_new for most type slots, __init__ among them: for performance reasons, even though the slot names are strings in the lookup, selection is done on the pointers to the strings, so the call to __init__ only works if you use interned strings in your implementation of tp_new. And yes, this behavior has changed subtly across Python versions and, in the case of 2.7, even across patch versions.

I'm pretty sure that these subtle problems are the reason why you cannot find much about them in the docs: if the behavior were documented, it would have to be supported, which would come at a performance cost for the common case where there is no user-provided tp_new.

As for the reason to have an __init__ called on a freshly created type: it is only important if you have extra data members to initialize that cannot be part of the dictionary handed to the constructor (the normal type_new will take care of those). This is often the case when defining the new type in C, as the type may carry internal data that is not supposed to be Python-visible, but it is uncommon when done from Python.

As a perhaps extreme example, consider Python-C++ cross-inheritance in cppyy (http://cppyy.readthedocs.io/). All C++ proxy classes have an associated metaclass. When a Python class is derived from a C++ class and overrides C++ virtual methods, the metaclass gets called and it interjects a dispatcher class that is invisible to the Python side: it is part of the C++ hierarchy, but not part of the Python hierarchy. This extra setup is performed after the normal type_new has run. (As an implementation detail, this is still all done in the tp_new method, to avoid problems with when tp_init is called and when it is not.) Note that this behavior requires a hook that runs when a derived class is created, which cannot be done without a metaclass.

In general, however, the Python-side code should behave the same across implementations and not show the subtle problems of the C side. Underneath, though, there may be large differences, not just the subtle ones alluded to above. E.g., PyPy does not support metaclasses in RPython at all, so in that case you have to embed Python code.

Personally, on the Python side, I have far less need for __new__. In practice, it's a bit of a misnomer: if you just want to control the class creation and nothing more, implementing __call__ on a pretend metaclass provides the same end-user syntax with far less hassle. The only reason to use __new__ is if you want to control the class's behavior post-creation through custom metaclass methods/properties.
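A minimal sketch of such a pretend metaclass, assuming Python 3, where the metaclass= keyword accepts any callable taking (name, bases, namespace). PretendMeta, factory, Widget, SubWidget and the created_by attribute are hypothetical names chosen for illustration.

```python
class PretendMeta:
    """Not a real metaclass: just a callable with the metaclass call signature."""

    def __init__(self):
        self.created = []  # names of classes built through this object

    def __call__(self, name, bases, namespace):
        namespace = dict(namespace)
        namespace.setdefault("created_by", "PretendMeta")
        self.created.append(name)
        # The real work is still done by plain `type`.
        return type(name, bases, namespace)


factory = PretendMeta()


class Widget(metaclass=factory):
    pass


class SubWidget(Widget):  # built by plain `type`; factory is not consulted
    pass


print(type(Widget), Widget.created_by, factory.created)
```

The trade-off is visible in SubWidget: because Widget is an ordinary instance of type, subclasses are not routed through the callable again, which is exactly the post-creation control that a real metaclass with __new__ would give you.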
