How to trace python relative imports?

Question

In general, is there any way to trace or debug the python import process, eg to understand where cpython will and won't search for modules (and why)? This matters especially when dealing with relative imports, subpackages, scripts inside packages, and different ways of invoking them (such as whether the current working directory is inside or outside the package).

For example, the following behaviour (tested on conda-forge python 3.6.7) looks like a bug to me. (Update: this particular example was fixed in later releases of python. Nonetheless, the debugging techniques may still be relevant more broadly, and they give some insight into how the language operates.)

>>> from curses import textpad
>>> from . import textpad # <-- expected to fail?
>>> from . import ascii
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'ascii'
>>> from curses import ascii
>>> from . import textpad
>>> from . import ascii
>>>

Answer 1

This behaviour is a bug in cpython.

Every python import statement is translated into one or more calls to the built-in function __import__. (This is documented, and the calls can be intercepted.)

In cpython there are two implementations of __import__: the python reference implementation (importlib.__import__, in the importlib standard library module), and a C implementation (builtins.__import__, which can be accessed or replaced via the builtins module). The C implementation is the one invoked by default.
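As a small sketch of the point above, both implementations can be called directly with the same arguments. The module json.decoder is only an arbitrary standard-library example chosen here, not something taken from the question.

```python
import builtins
import importlib

# With an empty fromlist, __import__ returns the TOP-LEVEL package,
# which is what a plain "import json.decoder" binds to the name "json".
top = builtins.__import__('json.decoder')

# With a non-empty fromlist, it returns the submodule itself, which is
# what "from json.decoder import JSONDecoder" needs.
sub = importlib.__import__('json.decoder', fromlist=('JSONDecoder',))

print(top.__name__)
print(sub.__name__)
```

Both functions accept the same (name, globals, locals, fromlist, level) arguments, which is why one can stand in for the other.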

Here is a script which explores the issue (note curses.ascii and curses.textpad are some modules in the python standard library):

import builtins
import importlib._bootstrap

commands = ['from curses import ascii',
            'from . import ascii',
            'from . import textpad']

def mock(name, globals=None, locals=None, fromlist=(), level=0):
    # Log each call, then delegate to whichever implementation is under test.
    print('    __import__ :', repr(name), ':', fromlist, ':', level)
    return alternate(name, globals, locals, fromlist, level)

original = builtins.__import__  # the default C implementation
builtins.__import__ = mock      # every import statement now goes through mock

for implementation in ['original', 'importlib._bootstrap.__import__']:
    print(implementation.upper(), '\n')
    alternate = eval(implementation)  # resolve the name to the actual function
    try:
        for command in commands:
            print(command)
            exec(command)
    except ImportError as err:
        print('   ', repr(err), '\n\n')

builtins.__import__ = original  # restore the default when finished

The output demonstrates that the cpython built-in, unlike the reference implementation, fails to check for a parent package before attempting a relative import:

ORIGINAL 

from curses import ascii
    __import__ : 'curses' : ('ascii',) : 0
    __import__ : '_curses' : ('*',) : 0
    __import__ : 'os' : None : 0
    __import__ : 'sys' : None : 0
from . import ascii
    __import__ : '' : ('ascii',) : 1
from . import textpad
    __import__ : '' : ('textpad',) : 1
    ImportError("cannot import name 'textpad'",) 


IMPORTLIB._BOOTSTRAP.__IMPORT__ 

from curses import ascii
    __import__ : 'curses' : ('ascii',) : 0
from . import ascii
    __import__ : '' : ('ascii',) : 1
    ImportError('attempted relative import with no known parent package',)

In cpython, the from [...][X] import Y [as Z] statement is translated into two chief bytecode instructions (plus some housekeeping instructions, to appropriately load and save between the stack and the lists of constants/variables):

IMPORT_NAME : This performs a call to builtins.__import__ . The call arguments are the instruction argument (the name X of the module to return), some current state of the interpreter frame ( globals() and locals() ), and two items lifted from the stack (the list Y which may contain submodules to import, and the relative level ie the number of [...] ). The call is expected to return a module object, which is placed on the stack.
IMPORT_FROM : This inspects the module on top of the stack, and gets from its attribute Y an object (which it also leaves on the stack).

(These are documented alongside the dis library and implemented in ceval.c .)
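To see these two instructions for yourself, the dis module can disassemble a compiled import statement without executing it. This is only an illustrative sketch; the surrounding housekeeping instructions differ between python versions, so no exact listing is shown.

```python
import dis

# Compile (but do not run) a relative import statement.
code = compile('from . import foo as fubar', '<example>', 'exec')

# Print the full disassembly; look for IMPORT_NAME followed by IMPORT_FROM.
dis.dis(code)

# Collect just the instruction names for a quick check.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print('IMPORT_NAME' in opnames, 'IMPORT_FROM' in opnames)
```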

If we try from . import foo (ie X is blank and the level is 1) then IMPORT_NAME tries to return the module object for the current parent package (eg whatever is named by the __package__ global). If this has no attribute named foo then IMPORT_FROM raises an ImportError.

In an interactive interpreter shell or a simple script, __package__ is None . In this circumstance:

importlib.__import__ would have raised an ImportError (attempted relative import with no known parent package), but
builtins.__import__ returns the module __main__ (built-in), which is the python top-level script environment.

This is the key difference. Since all globals are attributes of the __main__ module, this misbehaviour results:

>>> foo = 'oops'
>>> from . import foo as fubar
>>> fubar
'oops'
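For comparison, the reference implementation can be called by hand with a globals dict that resembles an interactive session. This is a sketch: fake_globals is a hypothetical stand-in built for illustration, and the warnings filter only silences the notice python may emit about falling back on __name__.

```python
import importlib
import warnings

# Globals resembling an interactive shell or simple script: no parent package.
fake_globals = {'__name__': '__main__', '__package__': None, 'foo': 'oops'}

with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    try:
        # Equivalent to "from . import foo": blank name, level 1.
        importlib.__import__('', fake_globals, None, ('foo',), 1)
    except ImportError as err:
        message = str(err)
    else:
        message = None  # would indicate the relative import was accepted

print(message)
```

Here the call is refused before any module object is returned, so IMPORT_FROM never gets the chance to pick up the global named foo.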

There is also another misbehaviour: if attempting a deeper level of relative import (beyond the top-level package eg from ..... import foo ) then builtins.__import__ raises a ValueError (instead of the expected ImportError ).

Update: Both bugs explored here were subsequently fixed in cpython (see bpo-37409). Aside from the insight above into how python syntax relates to python bytecode instructions, setting builtins.__import__ = importlib.__import__ (to use the pure-python reference implementation) should facilitate stepping through any import process with ordinary python debuggers.
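One way to apply that swap without leaving it in place is a small context manager. This is a minimal sketch; the name pure_python_import is made up for this example, and colorsys is just an arbitrary standard-library module to import.

```python
import builtins
import contextlib
import importlib


@contextlib.contextmanager
def pure_python_import():
    """Temporarily route import statements through importlib.__import__."""
    saved = builtins.__import__
    builtins.__import__ = importlib.__import__
    try:
        yield
    finally:
        builtins.__import__ = saved  # always restore the previous function


with pure_python_import():
    # Import statements executed here go through the python-level code path.
    import colorsys

print(colorsys.__name__)
```

Restoring the saved function in a finally clause keeps the rest of the program on the default implementation even if the import under investigation fails.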

solution1
0 2019-06-26 08:09:23
