Python module globals with multiple imports

Question

What code is run, and what is not, when a module is imported in Python?

What code is run, and what is not, when a module is imported for the second time in Python?

module1.py :

GLOBAL_VAR = 'orig'

print('module1: GLOBAL_VAR = {}'.format(GLOBAL_VAR))

def init():
    global GLOBAL_VAR
    print('module1:init(1): GLOBAL_VAR = {}'.format(GLOBAL_VAR))
    GLOBAL_VAR = 'changed'
    print('module1:init(2): GLOBAL_VAR = {}'.format(GLOBAL_VAR))

module2.py :

print('module2: importing module1')
import module1

print('module2(1): module1.GLOBAL_VAR = {}'.format(module1.GLOBAL_VAR))

module1.init()

print('module2(2): module1.GLOBAL_VAR = {}'.format(module1.GLOBAL_VAR))

module3.py :

print('module3: importing module1')
import module1

print('module3(1): module1.GLOBAL_VAR = {}'.format(module1.GLOBAL_VAR))

main.py :

import module2
import module3

Output :

python3 main.py

module2: importing module1
module1: GLOBAL_VAR = orig
module2(1): module1.GLOBAL_VAR = orig
module1:init(1): GLOBAL_VAR = orig
module1:init(2): GLOBAL_VAR = changed
module2(2): module1.GLOBAL_VAR = changed
module3: importing module1
module3(1): module1.GLOBAL_VAR = changed

Basically, the "freestanding" code (not in a function, not in a class) runs only once. I would like to know more about how this works and what the limitations are. In particular, when is this not true?

My hunch is that imported modules, even when they are imported from different modules, are registered at a per-interpreter level. The interpreter knows whether a module's code has already run; after that, it keeps the module's current state in an object, and every importer gets that same object.

But what can break this? What if I use threads, and a second module, running in another thread, imports module X while X's long-running top-level code has not finished by the time the second import gets a timeslot? And what happens to this whole system if I am using multiprocessing?

Unfortunately I did not find a good explanation.

I have already tested how it works in a basic setup, so I know that much. My question is why it works this way: what is the underlying mechanism?

Answer 1

You are correct that all top-level code is executed when a module is imported for the first time. This includes function definitions: the def statement runs and binds the function's name to the function object, but the function's body does not run until the function is called.
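To make the distinction between running a def statement and running a function body concrete, here is a minimal sketch. The module name demo_toplevel_mod and its contents are made up for this illustration; the module is written to a temporary directory so that the example is self-contained.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE_NAME = "demo_toplevel_mod"

# The module records what ran in a list so we can inspect it afterwards.
source = textwrap.dedent("""
    events = ["top-level code ran"]

    def init():
        events.append("init() body ran")
""")

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, MODULE_NAME + ".py").write_text(source)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # the file was created at runtime
    try:
        mod = importlib.import_module(MODULE_NAME)
    finally:
        sys.path.remove(tmp)

# Right after the import, only the top-level statements have run.
events_after_import = list(mod.events)

# The body of init() runs only when it is called.
mod.init()
events_after_call = list(mod.events)

print(events_after_import)  # ['top-level code ran']
print(events_after_call)    # ['top-level code ran', 'init() body ran']
```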

As soon as a module is imported, it is stored in sys.modules, a dict that maps module names to module objects. So, after import module_a you can refer to the module as sys.modules['module_a']. You can even delete it from the dict (see What does "del sys.modules[module]" actually do? for the consequences of doing so). As long as you don't delete the module from sys.modules, all future imports simply get the existing object from there, which is why module3 sees the value that module2 changed. You can find a detailed description of the import system here: https://docs.python.org/3/reference/import.html
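The caching behaviour can be checked directly. The following is a small sketch, again using a throwaway module (demo_cached_mod is a hypothetical name) written to a temporary directory. It shows that a repeated import returns the same object, and that removing the entry from sys.modules causes the module's code to run again in a new module object.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE_NAME = "demo_cached_mod"

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, MODULE_NAME + ".py").write_text("GLOBAL_VAR = 'orig'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # the file was created at runtime
    try:
        # First import: the module's code runs and the result is cached.
        first = importlib.import_module(MODULE_NAME)
        first.GLOBAL_VAR = "changed"

        # Second import: served from sys.modules, nothing is re-executed.
        second = importlib.import_module(MODULE_NAME)
        same_object = second is first is sys.modules[MODULE_NAME]
        value_seen_by_second = second.GLOBAL_VAR

        # Dropping the cache entry forces a fresh execution of the module.
        del sys.modules[MODULE_NAME]
        third = importlib.import_module(MODULE_NAME)
        fresh_object = third is not first
        value_seen_by_third = third.GLOBAL_VAR
    finally:
        sys.path.remove(tmp)

print(same_object, value_seen_by_second)  # True changed
print(fresh_object, value_seen_by_third)  # True orig
```

Note that after the deletion the old object (first) still exists and still holds 'changed'; any code that imported it earlier keeps using that old object.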

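As for multi-threading, this is usually a non-issue, although the protection comes from the import system's own locking rather than only from the global interpreter lock (GIL): https://docs.python.org/3/c-api/init.html?highlight=gil#thread-state-and-the-global-interpreter-lock . Since Python 3.3 each module has its own import lock, so a thread that imports a module whose top-level code is still running in another thread waits until that code has finished. With multiprocessing, each process has its own interpreter and therefore its own sys.modules, so a change made to module1.GLOBAL_VAR in one process is not visible in another.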
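The threading scenario from the question can be tried out with a short sketch. It assumes a throwaway module (demo_slow_mod, a hypothetical name) whose top-level code sleeps briefly to stand in for a long initialisation. Two threads import it at the same time, and each records which object it received and whether the module had finished initialising.

```python
import importlib
import sys
import tempfile
import textwrap
import threading
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE_NAME = "demo_slow_mod"

# READY is set only at the very end of the module's top-level code.
source = textwrap.dedent("""
    import time
    time.sleep(0.2)  # stand-in for slow top-level code
    READY = True
""")

results = []  # list.append is safe to call from several threads


def worker():
    mod = importlib.import_module(MODULE_NAME)
    # If the import returned too early, READY would not exist yet.
    results.append((id(mod), getattr(mod, "READY", False)))


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, MODULE_NAME + ".py").write_text(source)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # the file was created at runtime
    try:
        threads = [threading.Thread(target=worker) for _ in range(2)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        sys.path.remove(tmp)

one_object = len({module_id for module_id, _ in results}) == 1
all_ready = all(ready for _, ready in results)
print(one_object, all_ready)
```

On a current CPython both values are expected to be True: the second thread blocks on the module's import lock and then receives the same, fully initialised object.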

solution1
0 2019-06-19 11:16:11