
Python Module Initialization Order?

I am a Python newbie coming from a C++ background. While I know it's not Pythonic to look for matching concepts using my old C++ knowledge, I think this is still a reasonable general question to ask:

In C++, there is a well-known problem called the global/static variable initialization order fiasco. C++ does not define the order in which global/static variables in different compilation units are initialized, so a global/static variable that depends on one in another compilation unit might be initialized before its dependency. When the dependent then starts using the services provided by the dependency object, the behavior is undefined. I don't want to go too deep here into how C++ solves this problem. :)

In the Python world, I do see global variables being used, even across different .py files. One typical use case I saw: a global object is initialized in one .py file, and code in other .py files fearlessly starts using that global object, assuming it must have been initialized somewhere else. In C++ I would definitely find this unacceptable because of the problem described above.

I am not sure whether the above use case is common practice in Python (Pythonic), and how does Python solve this kind of global variable initialization order problem in general?

Python's import executes a new module from beginning to end. Subsequent imports only bind another reference to the existing module object in sys.modules, even if that module is still in the middle of being imported because of a circular import. Module attributes ("global variables" actually live at module scope) that were initialized before the circular import will exist.

main.py:

import a

a.py:

var1 = 'foo'
import b
var2 = 'bar'

b.py:

import a
print(a.var1)  # works: var1 was assigned before a.py imported b
print(a.var2)  # fails with AttributeError: a.py has not reached var2 yet
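The three files above can be reproduced in a single script. This is only a sketch: the module names demo_a and demo_b are hypothetical stand-ins for a.py and b.py, written to a temporary directory, and the second module records what it can see instead of printing.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for a.py and b.py from the answer above.
A_SOURCE = (
    "var1 = 'foo'\n"
    "import demo_b\n"   # demo_b runs here, while demo_a is only half executed
    "var2 = 'bar'\n"
)
B_SOURCE = (
    "import demo_a\n"                        # receives the partially initialized module
    "seen_var1 = demo_a.var1\n"              # already assigned
    "has_var2 = hasattr(demo_a, 'var2')\n"   # not assigned yet
)


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_a.py").write_text(A_SOURCE)
        Path(tmp, "demo_b.py").write_text(B_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            demo_a = importlib.import_module("demo_a")
            demo_b = sys.modules["demo_b"]
            # var2 exists now, because demo_a has finished executing.
            return demo_b.seen_var1, demo_b.has_var2, demo_a.var2
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_a", None)
            sys.modules.pop("demo_b", None)


result = run_demo()
print(result)  # ('foo', False, 'bar')
```

The order is fully determined by where the import statements sit, which is why moving `import demo_b` below the `var2` assignment would make both attributes visible.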

Under C++, there is a well known problem called global/static variable initialization order fiasco, due to C++'s inability to decide which global/static variable would be initialized first across compilation units,

I think that statement highlights a key difference between Python and C++: in Python, there is no such thing as different compilation units. What I mean by that is, in C++ (as you know), two different source files might be compiled completely independently from each other, and thus if you compare a line in file A and a line in file B, there is nothing to tell you which will get placed first in the program. It's kind of like the situation with multiple threads: you cannot say whether a particular statement in thread 1 will be executed before or after a particular statement in thread 2. You could say C++ programs are compiled in parallel.

In contrast, in Python, execution begins at the top of one file and proceeds in a well-defined order through each statement in the file, branching out to other files at the points where they are imported. In fact, you could almost think of the import directive as an #include, and in that way you could identify the order of execution of all the lines of code in all the source files in the program. (Well, it's a little more complicated than that, since a module only really gets executed the first time it's imported, and for other reasons.) If C++ programs are compiled in parallel, Python programs are interpreted serially.
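The caveat that a module only gets executed the first time it is imported can be checked with a small sketch. The module name demo_once is hypothetical; its body creates a fresh object each time it runs, so seeing the same object twice means the body ran only once.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module: a new token object is created on every execution.
        Path(tmp, "demo_once.py").write_text("token = object()\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            first = importlib.import_module("demo_once")
            first_token = first.token
            second = importlib.import_module("demo_once")  # served from sys.modules
            same_module = first is second
            same_token = second.token is first_token  # body was not run again
            cached = sys.modules["demo_once"] is first
            return same_module, same_token, cached
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_once", None)


result = run_demo()
print(result)  # (True, True, True)
```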

Your question also touches on the deeper meaning of modules in Python. A Python module - which is everything that is in a single .py file - is an actual object. Everything declared at "global" scope in a single source file is actually an attribute of that module object. There is no true global scope in Python. (Python programmers often say "global" and in fact there is a global keyword in the language, but it always really refers to the top level of the current module.) I could see that being a bit of a strange concept to get used to coming from a C++ background. It took some getting used to for me, coming from Java, and in this respect Java is a lot more similar to Python than C++ is. (There is also no global scope in Java.)
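To make the "module is an object" point concrete, here is a minimal sketch that builds a module object by hand instead of loading it from a .py file. The name settings and its contents are made up for illustration; a real module loaded by import behaves the same way.

```python
import types

# A hand-built module object (hypothetical name "settings").
settings = types.ModuleType("settings")

# Run some top-level code inside the module's namespace, as import would.
exec(
    "counter = 0\n"
    "def increment():\n"
    "    global counter   # 'global' means: top level of THIS module\n"
    "    counter += 1\n",
    settings.__dict__,
)

settings.increment()
settings.increment()

# The "global" variable is just an attribute of the module object.
print(settings.counter)  # 2

# The function's global namespace is the module's own attribute dictionary.
shares_namespace = settings.increment.__globals__ is settings.__dict__
print(shares_namespace)  # True
```

Other modules reach the variable as settings.counter, which is why cross-file "globals" in Python are always accessed through some module object.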

I will mention that in Python it is perfectly normal to use a variable without having any idea whether it has been initialized/defined or not. Well, maybe not normal, but at least acceptable under appropriate circumstances. In Python, trying to use an undefined variable raises a NameError; you don't get arbitrary behavior as you might in C or C++, so you can easily handle the situation. You may see this pattern:

try:
    duck.quack()
except NameError:
    pass

which does nothing if duck does not exist. Actually, what you'll more commonly see is

try:
    duck.quack()
except AttributeError:
    pass

which does nothing if duck does not have a method named quack. (AttributeError is the kind of error you get when you try to access an attribute of an object, but the object does not have any attribute by that name.) This is what passes for a type check in Python: we figure that if all we need the duck to do is quack, we can just ask it to quack, and if it does, we don't care whether it's really a duck or not. (It's called duck typing ;-)
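A runnable version of that pattern might look like the sketch below. The classes Duck, Robot and Rock and the helper try_quack are invented for illustration.

```python
class Duck:
    def quack(self):
        return "Quack!"


class Robot:
    # Not a duck, but it can quack, which is all the caller needs.
    def quack(self):
        return "Beep-quack."


class Rock:
    # No quack method at all.
    pass


def try_quack(thing):
    """Ask the object to quack; return None if it has no such method."""
    try:
        return thing.quack()
    except AttributeError:
        return None


results = [try_quack(thing) for thing in (Duck(), Robot(), Rock())]
print(results)  # ['Quack!', 'Beep-quack.', None]
```

One thing to keep in mind with this style: the except clause also catches an AttributeError raised inside quack itself, so it can hide a bug in the method as well as a missing method.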


 