In general, is there any way to trace or debug the python import process, e.g. to understand where cpython will and won't search for modules (and why)? Especially when dealing with relative imports, subpackages, scripts inside packages, and different ways to invoke them (such as whether the current working directory is inside or outside of the package)?
For example, the following behaviour (tested on conda-forge python 3.6.7) looks like a bug to me. (Update: this particular example was subsequently fixed in later releases of python. Nonetheless, the debugging techniques may still be relevant more broadly, as well as providing insight into how the language operates.)
>>> from curses import textpad
>>> from . import textpad # <-- expected to fail?
>>> from . import ascii
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'ascii'
>>> from curses import ascii
>>> from . import textpad
>>> from . import ascii
>>>
This behaviour is a bug in cpython.
Every python import statement is translated into one or more calls to the __import__ built-in python function. (This is documented and can be intercepted.) In cpython there are two implementations of __import__: there is the python reference implementation (in the importlib standard library), and there is a C implementation (which can be accessed or intercepted via the builtins standard library) which is invoked by default.
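As a small illustration of the two implementations, the sketch below calls each of them directly. The choice of os.path as the module to import is only an example (it is not from the original post); any importable dotted name would do.

```python
import builtins
import importlib
import os

# The C implementation is exposed as builtins.__import__; the python
# reference implementation is importlib.__import__. Both take the same
# arguments: (name, globals, locals, fromlist, level).
print(type(builtins.__import__).__name__)
print(type(importlib.__import__).__name__)

# With an empty fromlist, both return the top-level package ...
c_result = builtins.__import__("os.path")
py_result = importlib.__import__("os.path")

# ... while a non-empty fromlist makes them return the named submodule.
with_fromlist = importlib.__import__("os.path", fromlist=("join",))

print(c_result.__name__, py_result.__name__, with_fromlist.__name__)
```

The fromlist behaviour is why "import a.b" binds the name a, whereas "from a.b import c" needs the module a.b itself.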
Here is a script which explores the issue (note that curses.ascii and curses.textpad are modules in the python standard library):
commands = ['from curses import ascii',
            'from . import ascii',
            'from . import textpad']

def mock(name, globals=None, locals=None, fromlist=(), level=0):
    # Log every call, then delegate to whichever implementation is under test.
    print(' __import__ :', repr(name), ':', fromlist, ':', level)
    return alternate(name, globals, locals, fromlist, level)

import builtins
import importlib._bootstrap
original = builtins.__import__  # the C implementation
builtins.__import__ = mock      # intercept all import statements

for implementation in ['original', 'importlib._bootstrap.__import__']:
    print(implementation.upper(), '\n')
    alternate = eval(implementation)
    try:
        for command in commands:
            print(command)
            exec(command)
    except ImportError as err:
        print(' ', repr(err), '\n\n')
The output demonstrates that the cpython built-in, unlike the reference implementation, fails to check for a parent package before attempting a relative import:
ORIGINAL

from curses import ascii
 __import__ : 'curses' : ('ascii',) : 0
 __import__ : '_curses' : ('*',) : 0
 __import__ : 'os' : None : 0
 __import__ : 'sys' : None : 0
from . import ascii
 __import__ : '' : ('ascii',) : 1
from . import textpad
 __import__ : '' : ('textpad',) : 1
  ImportError("cannot import name 'textpad'",)


IMPORTLIB._BOOTSTRAP.__IMPORT__

from curses import ascii
 __import__ : 'curses' : ('ascii',) : 0
from . import ascii
 __import__ : '' : ('ascii',) : 1
  ImportError('attempted relative import with no known parent package',)
In cpython, the from [...][X] import Y [as Z] statement is translated into two chief bytecode instructions (plus some housekeeping instructions, to appropriately load and save between the stack and the lists of constants/variables):

IMPORT_NAME: This performs a call to builtins.__import__. The call arguments are the instruction argument (the name X of the module to return), some current state of the interpreter frame (globals() and locals()), and two items lifted from the stack (the list Y, which may contain submodules to import, and the relative level, i.e. the number of leading dots, written [...] above). The call is expected to return a module object, which is placed on the stack.

IMPORT_FROM: This inspects the module on top of the stack and gets its attribute Y, an object which it also leaves on the stack.

(These are documented alongside the dis library and implemented in ceval.c.)
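To see these two instructions for yourself, the dis library can disassemble a compiled import statement without executing it. This is only a sketch: the exact housekeeping instructions around IMPORT_NAME and IMPORT_FROM differ between python versions, and the filename "<example>" is an arbitrary placeholder.

```python
import dis

# Compile (but do not run) a relative import statement.
code = compile("from . import foo", "<example>", "exec")

# Collect the instruction names so they can be inspected programmatically.
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Print each instruction with its human-readable argument.
for ins in dis.get_instructions(code):
    print(ins.opname, ins.argrepr)
```

In the listing, the instructions just before IMPORT_NAME push the level and the fromlist onto the stack, and the instruction after IMPORT_FROM stores the result under the name foo.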
If we try from . import foo (i.e. X is blank and the level is 1) then IMPORT_NAME tries to return the module object for the current parent package (e.g. whatever is named by the __package__ global). If this has no attribute named foo then IMPORT_FROM raises an ImportError.
In an interactive interpreter shell or a simple script, __package__ is None. In this circumstance, importlib.__import__ would have raised an ImportError (attempted relative import with no known parent package), but builtins.__import__ returns the module __main__ (built-in), which is the python top-level script environment. This is the key difference. Since all globals are attributes of the __main__ module, this misbehaviour results:
>>> foo = 'oops'
>>> from . import foo as fubar
>>> fubar
'oops'
There is also another misbehaviour: if attempting a deeper level of relative import (beyond the top-level package, e.g. from ..... import foo) then builtins.__import__ raises a ValueError (instead of the expected ImportError).
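Both situations can be reproduced against the reference implementation by calling importlib.__import__ directly with a hand-built globals dict. This is a rough sketch rather than anything from the original answer: the helper try_relative_import and the two dicts are hypothetical, and the second dict merely pretends to be a module inside the json package.

```python
import importlib
import warnings

def try_relative_import(module_globals, level):
    """Return the exception raised by 'from <dots> import foo', or None."""
    with warnings.catch_warnings():
        # A hand-built globals dict has no __spec__, which can trigger warnings.
        warnings.simplefilter("ignore")
        try:
            importlib.__import__("", module_globals, None, ("foo",), level)
        except (ImportError, ValueError) as err:
            return err
    return None

# Globals as seen by a simple script or interactive shell: no parent package.
script_globals = {"__name__": "__main__", "__package__": None}
no_parent = try_relative_import(script_globals, 1)
print(repr(no_parent))

# Globals of a module one level deep, asked to climb five levels.
package_globals = {"__name__": "json.decoder", "__package__": "json"}
too_deep = try_relative_import(package_globals, 5)
print(repr(too_deep))
```

ValueError is caught alongside ImportError because the exception type for the "beyond top-level package" case has varied between python versions.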
Update: Both bugs explored here were subsequently fixed in cpython (see bpo-37409). Aside from the above insight into how python syntax relates to python bytecode instructions, setting builtins.__import__ = importlib.__import__ (to use the python reference implementation rather than the C one) should facilitate stepping through any import process with ordinary python debuggers.
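One way to apply that swap without leaving it in place for the rest of the program is a small context manager. This is a minimal sketch under my own assumptions: the name pure_python_import is made up, and colorsys is just an arbitrary small standard library module to import.

```python
import builtins
import contextlib
import importlib

@contextlib.contextmanager
def pure_python_import():
    """Temporarily route import statements through importlib.__import__."""
    saved = builtins.__import__
    builtins.__import__ = importlib.__import__
    try:
        yield
    finally:
        # Always restore the original C implementation.
        builtins.__import__ = saved

with pure_python_import():
    # Set a breakpoint here (e.g. with pdb) and step into the import.
    import colorsys

print(colorsys.__name__)
```

Note that a debugger may still skip over frames from importlib's frozen bootstrap modules unless their source is available, so results can vary between installations.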