
Global variables across modules and threads in Python

I have a config file, config.py, which holds a global variable; i.e., config.py contains the following (5 is the default):

# config.py
globalVar = 5

Now in a module run.py I'm setting the global variable and then I call a printing function:

# run.py
import config
import test
config.globalVar = 7
test.do_printing()

# test.py
import config
def do_printing():
  print(config.globalVar)

This works well (i.e., 7 is printed), but if I use multiple threads for printing (in test.py) it no longer works: the threads do not see the change made by run.py (i.e., 5 is printed).

How can this be solved?

Even when running on a single thread you might have issues doing that. For example, if you do from config import globalVar instead and then rebind globalVar in the local module, the local name simply loses its reference to the object in the config module; config.globalVar itself is unchanged.
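
To make the rebinding pitfall concrete, here is a minimal single-file sketch. The module name demo_config is hypothetical and is built in memory only so the example does not need a separate file; in real code it would be your config.py.

```python
import sys
import types

# Hypothetical stand-in for config.py, created in memory for this sketch.
demo_config = types.ModuleType("demo_config")
demo_config.globalVar = 5
sys.modules["demo_config"] = demo_config

from demo_config import globalVar  # copies the current binding into this namespace

globalVar = 7                                 # rebinds only the local name
local_after_rebind = globalVar                # 7
module_after_rebind = demo_config.globalVar   # the module attribute is untouched

demo_config.globalVar = 7                     # attribute assignment on the module object
module_after_attribute_set = demo_config.globalVar

print(local_after_rebind, module_after_rebind, module_after_attribute_set)
```

Only the assignment through the module object (demo_config.globalVar = 7) is visible to other code that reads the attribute from the module.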

And even if you don't do that, if changes to the variable take place at import time of your various modules, it is very hard to keep track of the actual import order.

When you add threads, this becomes practically unmanageable because of all sorts of race conditions. That said, apart from a race condition (i.e., one of your threads reads the variable before it has been set on the other thread) or incorrect importing, threads should not affect the visibility of global variable changes in the way you describe.

The solution for having deterministic code is to use data structures that are appropriate for that interchange across threads (and data protection across threads).

The threading module itself offers the Event object, which one thread can use to wait reliably until another thread has changed the value you are expecting:

config.py:

from threading import Event

changed = Event()  # a new Event starts cleared, so no explicit clear() is needed

global_var = 5

module in worker thread:

import config

def do_things():
    while True:
        config.changed.wait()  # blocks until the other thread sets the event
        config.changed.clear()  # reset, so the next iteration waits for the next change
        do_more_things_with(config.global_var)

and on the main thread:

import config

config.global_var = 7
config.changed.set()  # frees the waiting thread to run

Note that in the above code I always refer to the objects in config with the dotted notation. That makes no difference for the "event" object - I could do from config import changed - since I am dealing with the internal state of the same object, it would work - but if I do from config import global_var and reassign it with global_var = 7, that only changes where the global_var name in the current module's context points. config.global_var still references the original value.
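
The three snippets above can be combined into one runnable file. This is only a sketch: the config object here is a SimpleNamespace standing in for the real config module, and the names worker and seen are made up for the illustration.

```python
import threading
import types

# Stand-in for config.py (an assumption made to keep the sketch in one file).
config = types.SimpleNamespace(changed=threading.Event(), global_var=5)

seen = []  # values observed by the worker thread

def worker():
    config.changed.wait()            # blocks until the main thread calls set()
    seen.append(config.global_var)   # reads the value only after the signal

t = threading.Thread(target=worker)
t.start()

config.global_var = 7    # update the shared value first...
config.changed.set()     # ...then signal the waiting thread
t.join()

print(seen)
```

Because the worker reads the value only after the event is set, and the event is set only after the assignment, the worker cannot observe the old default.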

And while you are at it, it is worth taking a look at the queue module, as well as at thread-local objects (threading.local).
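
As a rough illustration of the queue approach, the sketch below hands values to a worker thread instead of having it read a module-level global. The names updates, results and the None sentinel are choices made for this example, not something from the question.

```python
import queue
import threading

updates = queue.Queue()   # thread-safe channel from the main thread to the worker
results = []

def worker():
    while True:
        value = updates.get()    # blocks until an item is available
        if value is None:        # sentinel: no more work
            break
        results.append(value * 2)

t = threading.Thread(target=worker)
t.start()

for v in (5, 7):
    updates.put(v)
updates.put(None)   # tell the worker to stop
t.join()

print(results)
```

With a queue, the worker never has to guess whether a value has already been set: it simply receives each value in the order it was put.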

When it still does not work

Another possibility for not seeing the changes is that, since the parallelism is not in your code but in another library, that library is spawning processes with the multiprocessing module instead of threads.

If you were expecting threads but actually got multiprocessing-spawned processes, the problem would be exactly what you describe: changes to global variables made in one process are not visible in the others, simply because each process has its own copy of the variables.

If that is the case, it is possible to have typed (numeric) objects that are synchronized across the processes. Check the multiprocessing Array and Value classes, and multiprocessing.Queue to send and receive (mostly) arbitrary objects.
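
A small sketch of the difference between a plain global and a multiprocessing Value follows. It assumes a POSIX system, because it explicitly requests the "fork" start method to stay a single self-contained file; with the "spawn" start method you would need the usual if __name__ == "__main__" guard around everything that starts processes. The names plain_global, child and run_demo are invented for the example.

```python
import multiprocessing as mp

plain_global = 5

def child(shared):
    global plain_global
    plain_global = 7     # changes only the child process's own copy
    shared.value = 7     # written to memory shared with the parent

def run_demo():
    ctx = mp.get_context("fork")    # assumption: POSIX; not available on Windows
    shared = ctx.Value("i", 5)      # "i" is the typecode for a C int
    p = ctx.Process(target=child, args=(shared,))
    p.start()
    p.join()
    # The parent's plain global is unchanged; the shared Value is updated.
    return plain_global, shared.value

if __name__ == "__main__":
    print(run_demo())
```

The plain global keeps its default in the parent, which matches the symptom in the question, while the Value carries the change across the process boundary.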

(Add an import multiprocessing; print(multiprocessing.current_process()) line to your code to be sure. Regardless of the result, please suggest to the maintainers of the RandomizedSearchCV documentation that they mention explicitly what they do for parallelism.)
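
A slightly fuller version of that diagnostic is sketched below. The helper name where_am_i and the thread name "worker-1" are made up; the idea is to call the helper from inside the function that runs in parallel and compare the reports.

```python
import multiprocessing
import os
import threading

def where_am_i():
    """Return (process name, process id, thread name) for the caller."""
    return (
        multiprocessing.current_process().name,
        os.getpid(),
        threading.current_thread().name,
    )

reports = [where_am_i()]   # report from the calling thread

t = threading.Thread(target=lambda: reports.append(where_am_i()), name="worker-1")
t.start()
t.join()

for report in reports:
    print(report)
```

Threads of one process report the same process id and differ only in the thread name; if the process id differs between reports, the work is running in separate processes and plain globals will not be shared.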
