
Global variables across modules and threads in Python

I have a config file, config.py, which holds a global variable. In config.py I have the following (5 is the default):

# config.py
globalVar = 5

Now, in a module run.py, I set the global variable and then call a printing function:

# run.py
import config
import test
config.globalVar = 7
test.do_printing()

# test.py
import config
def do_printing():
  print(config.globalVar)

This works well (i.e. 7 is printed), but if I use multiple threads for printing (in test.py) it no longer works: the threads do not see the change made by run.py (i.e. 5 is printed).

How can this be solved?

Even when running on a single thread you might have issues doing that. For example, if you do from config import globalVar instead and then rebind globalVar in the local module, the local name simply loses its reference to the object in the config module; config.globalVar itself is unchanged.
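A minimal sketch of that rebinding pitfall, squeezed into one file. The module name demo_config is a hypothetical stand-in for config.py, created in memory with types.ModuleType so the example runs without extra files:

```python
import sys
import types

# Hypothetical in-memory stand-in for config.py (an assumption for this demo).
demo_config = types.ModuleType("demo_config")
demo_config.globalVar = 5
sys.modules["demo_config"] = demo_config

# "from ... import name" creates a separate local name bound to the same object.
from demo_config import globalVar

globalVar = 7                       # rebinds only the local name
after_rebind = demo_config.globalVar
print(after_rebind)                 # 5: the module attribute is untouched

import demo_config as cfg

cfg.globalVar = 7                   # sets the attribute on the module object
after_attr = demo_config.globalVar
print(after_attr)                   # 7: every importer of the module sees this
```

Only the dotted form changes the value that other modules read.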

And even if you don't do that, if changes to the variable take place at import time of your various modules, it is very hard to keep track of the actual import order.

When you add threads, that becomes unmanageable because of race conditions. However, other than a race condition (i.e. one of your threads reads the variable before it has been set on the other thread) or incorrect importing, threads should not affect the visibility of global variable changes in the way you describe.
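As a quick check of that claim, here is a small sketch in which the value is changed before any thread starts, so there is no race. The config object is a hypothetical SimpleNamespace standing in for the config module, and do_printing records the value instead of printing it from each thread:

```python
import threading
import types

# Hypothetical stand-in for the config module from the question.
config = types.SimpleNamespace(globalVar=5)
seen = []

def do_printing():
    # Every thread reads the same attribute on the same shared object.
    seen.append(config.globalVar)

config.globalVar = 7  # changed before the threads start, so no race

workers = [threading.Thread(target=do_printing) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(seen)  # [7, 7, 7, 7]
```

If threads in the real program still report 5, the cause is more likely timing, a from-import, or separate processes than threading itself.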

The way to get deterministic code is to use data structures designed for exchanging data across threads (and for protecting data shared across threads).

The threading module itself offers the Event object, which lets one thread wait until another signals that it has changed the value you are expecting:

config.py:

from threading import Event

changed = Event()
changed.clear()  # explicit, although a new Event already starts cleared

global_var = 5

Module in the worker thread:

import config

def do_things():
    while True:
        config.changed.wait()  # blocks until other thread sets the event
        do_more_things_with(config.global_var)

And on the main thread:

import config

config.global_var = 7
config.changed.set()  # frees the waiting thread to run

Note that in the above code I always refer to the objects in config with dotted notation. That makes no difference for the Event object: I could do from config import changed and it would still work, since I am only dealing with the internal state of the same object. But if I do from config import global_var and reassign it with global_var = 7, that only changes what the name global_var points to in the current module's namespace. config.global_var still references the original value.
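For experimenting, the three fragments can be combined into one runnable script. This is only a sketch: config is a hypothetical SimpleNamespace standing in for the config module, and the worker waits once and records the value rather than looping forever:

```python
import threading
import types

# Hypothetical stand-in for config.py: an Event plus the shared value.
config = types.SimpleNamespace(changed=threading.Event(), global_var=5)
results = []

def do_things():
    config.changed.wait()              # blocks until the main thread calls set()
    results.append(config.global_var)  # read only after the signal

worker = threading.Thread(target=do_things)
worker.start()

config.global_var = 7   # update the value first...
config.changed.set()    # ...then release the waiting thread
worker.join()

print(results)  # [7]
```

The order on the main thread matters: the value is assigned before set() is called, so the worker cannot wake up and read the old value.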

And while you are at it, it is worth taking a look at the queue module, as well as at thread-local objects.
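A rough sketch of the queue-based alternative: instead of mutating a global, the main thread hands each new value to the worker through a queue.Queue. The names updates and received, and the use of None as a stop marker, are choices made for this illustration:

```python
import queue
import threading

updates = queue.Queue()  # thread-safe channel from the main thread to the worker
received = []

def worker():
    while True:
        value = updates.get()   # blocks until an item is available
        if value is None:       # None is used here as a "stop" marker
            break
        received.append(value)

thread = threading.Thread(target=worker)
thread.start()

updates.put(7)     # send the new setting
updates.put(None)  # ask the worker to finish
thread.join()

print(received)  # [7]
```

With this pattern the worker never reads a half-updated global, because it only acts on values it has explicitly received.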

When it still does not work

Another possibility for not seeing the changes is that, since the parallelism is not in your code but in another library, that library is spawning processes with the multiprocessing module instead of threads.

If you were expecting threads but are actually getting multiprocessing-spawned processes, the problem would be exactly what you describe: changes to global variables made in one process are not visible in the others (simply because each process has its own copy of the variables).

If that is the case, it is possible to have (numeric, typed) objects that are synchronized across processes. Check the Array and Value classes, and the multiprocessing Queue, which can send and receive (mostly) arbitrary objects.
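A small sketch of the difference, assuming you control how the processes are started (which is not the case when a library starts them for you). The names plain_global, child_task and run_demo are made up for this illustration; the child changes both an ordinary global and a multiprocessing.Value, and only the latter is visible to the parent:

```python
import multiprocessing

plain_global = 5  # ordinary module-level variable: one copy per process

def child_task(shared):
    global plain_global
    plain_global = 7   # changes only the child's own copy
    shared.value = 7   # written to shared memory, visible to the parent

def run_demo():
    shared = multiprocessing.Value("i", 5)  # "i" means a C signed int
    proc = multiprocessing.Process(target=child_task, args=(shared,))
    proc.start()
    proc.join()
    # Back in the parent: the plain global is unchanged, the Value is not.
    return plain_global, shared.value

if __name__ == "__main__":
    print(run_demo())  # (5, 7)
```

The if __name__ == "__main__" guard is needed on platforms where child processes re-import the main module.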

(Add an import multiprocessing; print(multiprocessing.current_process()) line to your code to be sure. Regardless of the result, please suggest to the maintainers of the RandomizedSearchCV documentation that they mention explicitly what they are doing for parallelism.)
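A slightly expanded version of that diagnostic line, as a sketch. The helper name where_am_i is made up here; calling it from inside the function that the library runs in parallel shows whether the callers share one process:

```python
import multiprocessing
import os
import threading

def where_am_i():
    """Return (process name, process id, thread name) for the caller."""
    return (
        multiprocessing.current_process().name,
        os.getpid(),
        threading.current_thread().name,
    )

print(where_am_i())
```

If different calls report different process ids, the work is running in separate processes, and plain module globals are not shared between them.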


 