
Type hinting: Why does a self-referencing annotation in Python require quotes?

According to PEP 484:

When a type hint contains names that have not been defined yet, that definition may be expressed as a string literal, to be resolved later.

We're presented with the following (illegal) code:

class Tree:
    def __init__(self, left: Tree, right: Tree):
        self.left = left
        self.right = right

This seems like an intuitive way to write the type hints, and one would expect that, since the symbol Tree is defined, Python would know about it.

My working theory is that, although Python is run line-by-line, the symbol Tree hasn't yet been added to the namespace dict because the Tree class is still in the process of being defined. Thus, although the symbol exists, the object doesn't exist yet. Is this right? Is there more nuance than I'm giving it?

Python does run "line-by-line", but what actually executes is the bytecode, one instruction at a time. Let's consider the following example:

In [3]: class Foo:
   ...:     def bar(self) -> Foo:
   ...:         return Foo()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [3], in <module>
----> 1 class Foo:
      2     def bar(self) -> Foo:
      3         return Foo()

Input In [3], in Foo()
      1 class Foo:
----> 2     def bar(self) -> Foo:
      3         return Foo()

NameError: name 'Foo' is not defined

And let's look at the bytecode disassembly:

In [5]: dis.dis(
   ...:     """
   ...: class Foo:
   ...:     def bar(self) -> Foo:
   ...:         return Foo()
   ...: """
   ...: )
  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               0 (<code object Foo at 0x10b26bc90, file "<dis>", line 2>)
              4 LOAD_CONST               1 ('Foo')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               1 ('Foo')
             10 CALL_FUNCTION            2
             12 STORE_NAME               0 (Foo)
             14 LOAD_CONST               2 (None)
             16 RETURN_VALUE

Disassembly of <code object Foo at 0x10b26bc90, file "<dis>", line 2>:
  2           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 ('Foo')
              6 STORE_NAME               2 (__qualname__)

  3           8 LOAD_NAME                3 (Foo)
             10 LOAD_CONST               1 (('return',))
             12 BUILD_CONST_KEY_MAP      1
             14 LOAD_CONST               2 (<code object bar at 0x10b264d40, file "<dis>", line 3>)
             16 LOAD_CONST               3 ('Foo.bar')
             18 MAKE_FUNCTION            4 (annotations)
             20 STORE_NAME               4 (bar)
             22 LOAD_CONST               4 (None)
             24 RETURN_VALUE

Disassembly of <code object bar at 0x10b264d40, file "<dis>", line 3>:
  4           0 LOAD_GLOBAL              0 (Foo)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE

Look at the top-level code, the part that begins with LOAD_BUILD_CLASS. That instruction pushes __build_class__ onto the stack, which is the helper function that actually creates class objects. If you want the details of what happens inside that function, the data model documentation is the place to start. For our purposes, it chooses the correct metaclass (type, unless you specify otherwise or inherit from a class that does), prepares the class namespace, and then does class_object = metaclass(name, bases, namespace, **kwds). For the curious, the class namespace is populated roughly by initializing namespace = {} and then running exec(body, globals(), namespace) (although not exactly; again, read the data model documentation for the nitty-gritty details).
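The steps just described can be imitated by hand. This is only a rough sketch, not what __build_class__ literally does: the names body_source, module_globals and class_namespace are made up for illustration, a plain dict stands in for the module namespace, and the class body is a source string rather than a compiled code object.

```python
# Source for a class body, standing in for the compiled body code object.
body_source = '''
def bar(self):
    return "bar called"

# Record whether the class name is visible while the body is running.
foo_visible_in_body = "Foo" in globals()
'''

module_globals = {"__name__": "sketch"}  # hypothetical stand-in for the module namespace
class_namespace = {}

# Step 1: run the body; names it defines land in class_namespace.
exec(body_source, module_globals, class_namespace)

# Step 2: only now is the metaclass called to build the class object.
Foo = type("Foo", (), class_namespace)

# Step 3: bind the name, which is the job STORE_NAME does at the top level.
module_globals["Foo"] = Foo

print(class_namespace["foo_visible_in_body"])  # False
print(Foo().bar())                             # bar called
```

The first print shows that, while the body was executing, no Foo existed in the namespace the body could see; the name only appears after step 3.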

As you can see, the name Foo is not bound until after the class creation function has been called:

    12 STORE_NAME               0 (Foo) 

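Now consider this instruction from the class body:

    3           8 LOAD_NAME                3 (Foo)

It is used to build the return annotation. LOAD_NAME looks for Foo in the class namespace, then in the global namespace, then in the builtins. Foo is not bound in any of them yet, so a NameError is raised.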
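The string-literal form quoted from PEP 484 in the question avoids that lookup, because a string constant needs no name resolution when the function is defined. A minimal sketch using the Tree class from the question; passing localns to typing.get_type_hints is a choice made here so the example also works when Tree is not a module-level name.

```python
import typing


class Tree:
    # The quoted names are plain strings at definition time,
    # so nothing tries to look up Tree while the class body runs.
    def __init__(self, left: "Tree", right: "Tree"):
        self.left = left
        self.right = right


# The raw annotations are still strings.
raw = Tree.__init__.__annotations__
print(raw)

# get_type_hints evaluates the strings later, once Tree exists.
resolved = typing.get_type_hints(Tree.__init__, localns={"Tree": Tree})
print(resolved)
```

Static type checkers resolve the quoted names on their own; typing.get_type_hints is only needed when the real class objects are wanted at runtime.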

Note that you can use from __future__ import annotations and this will work: evaluation of the annotation is postponed, and the annotation is stored essentially as a string in __annotations__. Running the import in the interactive session does not change what dis prints for a source string, though. A __future__ import is a compiler directive that only affects the source it appears in, and the string passed to dis is compiled separately, so the import has to be part of that string.

If we put the __future__ import in the code passed to dis, we see:

In [6]: dis.dis(
   ...:     """
   ...: from __future__ import annotations
   ...: class Foo:
   ...:     def bar(self) -> Foo:
   ...:         return Foo()
   ...: """
   ...: )
  2           0 LOAD_CONST               0 (0)
              2 LOAD_CONST               1 (('annotations',))
              4 IMPORT_NAME              0 (__future__)
              6 IMPORT_FROM              1 (annotations)
              8 STORE_NAME               1 (annotations)
             10 POP_TOP

  3          12 LOAD_BUILD_CLASS
             14 LOAD_CONST               2 (<code object Foo at 0x10819c2f0, file "<dis>", line 3>)
             16 LOAD_CONST               3 ('Foo')
             18 MAKE_FUNCTION            0
             20 LOAD_CONST               3 ('Foo')
             22 CALL_FUNCTION            2
             24 STORE_NAME               2 (Foo)
             26 LOAD_CONST               4 (None)
             28 RETURN_VALUE

Disassembly of <code object Foo at 0x10819c2f0, file "<dis>", line 3>:
  3           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 ('Foo')
              6 STORE_NAME               2 (__qualname__)

  4           8 LOAD_CONST               0 ('Foo')
             10 LOAD_CONST               1 (('return',))
             12 BUILD_CONST_KEY_MAP      1
             14 LOAD_CONST               2 (<code object bar at 0x10819c240, file "<dis>", line 4>)
             16 LOAD_CONST               3 ('Foo.bar')
             18 MAKE_FUNCTION            4 (annotations)
             20 STORE_NAME               3 (bar)
             22 LOAD_CONST               4 (None)
             24 RETURN_VALUE

Disassembly of <code object bar at 0x10819c240, file "<dis>", line 4>:
  5           0 LOAD_GLOBAL              0 (Foo)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE

Note that the instruction that used to be LOAD_NAME now becomes:

      4           8 LOAD_CONST               0 ('Foo')
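The effect of that constant string can also be observed at runtime. The sketch below is an illustration only: the source is run through exec from a string (the variable names source and namespace are made up here) so that the __future__ import sits at the top of its own compilation unit, as it would at the top of a module.

```python
import typing

# The future import must come first in the source it applies to.
source = '''
from __future__ import annotations

class Foo:
    def bar(self) -> Foo:
        return Foo()
'''

namespace = {}
exec(source, namespace)  # no NameError: the annotation is never evaluated here
Foo = namespace["Foo"]

# The annotation was stored as a string, matching the LOAD_CONST above.
print(Foo.bar.__annotations__)

# It can be resolved on demand once Foo exists in the function's globals.
print(typing.get_type_hints(Foo.bar))
```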

Anyway, read more about that here
