
Python paths and import order

I really want to get this right because I keep running into it when generating some big py2app/py2exe packages. My package contains a lot of modules/packages that might also be in the user's site-packages/default location (if the user has a Python distribution), but I want my distributed packages to take precedence over them when running from my distribution.

Now, from what I've read here, PYTHONPATH should be the first thing added to sys.path after the current directory. However, from what I've tested on my machine, that is not the case: all the folders defined in $site-packages$/easy-install.pth take precedence over it.

Could someone please give me a more in-depth explanation of this import order, and help me find a way to set the environment variables so that the packages I distribute take precedence over the default installed ones?

So far my attempt is, for example on Mac OS with py2app, in my entry point script:

 import os

 # Point PYTHONPATH at the bundled locations (this overwrites any existing value).
 # os.pathsep is ':' on Mac OS.
 os.environ['PYTHONPATH'] = os.pathsep.join([
     DATA_PATH,
     os.path.join(DATA_PATH, 'lib'),
     os.path.join(DATA_PATH, 'lib', 'python2.7', 'site-packages'),
     os.path.join(DATA_PATH, 'lib', 'python2.7', 'site-packages.zip'),
 ])

This is basically the structure of the package generated by py2app. Then I just:

 SERVER = subprocess.Popen([PYTHON_EXE_PATH, '-m', 'bin.rpserver'
                            , cfg.RPC_SERVER_IP, cfg.RPC_SERVER_PORT],
                            shell=False, stdin=IN_FILE, stdout=OUT_FILE, 
                            stderr=ERR_FILE)

Here PYTHON_EXE_PATH is the path to the Python executable that py2app adds to the package. This works fine on a machine that doesn't have Python installed. However, when a Python distribution is already present, its site-packages take precedence.
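To see which locations actually win in the child process, it can help to print the child's sys.path. The following is a minimal sketch and not part of the original setup: it assumes the current interpreter (sys.executable) stands in for PYTHON_EXE_PATH, and child_sys_path is a hypothetical helper name.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dirs, python_exe=sys.executable):
    """Start a child interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ)
    # os.pathsep is ':' on Mac OS/Linux and ';' on Windows.
    env["PYTHONPATH"] = os.pathsep.join(extra_dirs)
    out = subprocess.check_output(
        [python_exe, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
    )
    return json.loads(out.decode("utf-8"))


if __name__ == "__main__":
    # Print each entry with its position so the search order is visible.
    for position, entry in enumerate(child_sys_path([os.getcwd()])):
        print(position, entry)
```

Any entry that appears before the PYTHONPATH directories in this listing will shadow the bundled packages.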

Python searches the paths in sys.path in order (see http://docs.python.org/tutorial/modules.html#the-module-search-path ). easy_install changes this list directly (see the last line in your easy-install.pth file):

import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

This basically takes whatever directories were added and inserts them at the beginning of the list.
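The effect of that line is easier to follow on a plain list. The sketch below replays the same slicing steps outside the interpreter; it assumes the usual first line of easy-install.pth, which records len(sys.path) in sys.__plen, and the function name and directories are hypothetical.

```python
def simulate_easy_install_pth(initial_path, pth_dirs, egginsert=0):
    """Replay what site does with easy-install.pth on a plain list.

    Returns the resulting path list and the updated insert position.
    """
    path = list(initial_path)
    plen = len(path)                 # first .pth line: sys.__plen = len(sys.path)
    path.extend(pth_dirs)            # each directory line is appended by site
    new = path[plen:]                # last .pth line: take the appended tail...
    del path[plen:]
    path[egginsert:egginsert] = new  # ...and splice it in at the insert position
    return path, egginsert + len(new)


if __name__ == "__main__":
    # Hypothetical directories, for illustration only.
    start = ["/from/PYTHONPATH", "/usr/lib/python2.7"]
    result, next_insert = simulate_easy_install_pth(start, ["/sp/a.egg", "/sp/b.egg"])
    print(result)
    print(next_insert)
```

The egg directories end up ahead of the entry that came from PYTHONPATH, which matches the behaviour described in the question.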

Also see "Eggs in path before PYTHONPATH environment variable".

This page is a high Google result for "Python import order", so here's a hopefully clearer explanation:

As both of those pages explain, the import order is:

  1. Built-in Python modules. You can see their names in sys.builtin_module_names (sys.modules, by contrast, holds the modules that have already been imported).
  2. The sys.path entries.
  3. The installation-dependent default locations (these are themselves part of sys.path).

And as the sys.path doc page explains, it is populated as follows:

  1. The first entry is the FULL PATH TO THE DIRECTORY of the file Python was started with. So /someplace/on/disk/> $ python /path/to/the/run.py means the first path is /path/to/the/ , and the path is the same if you are in /path/to/> $ python the/run.py (it is ALWAYS set to the full path of the directory, whether you gave Python a relative or an absolute file name). If Python was started without a file, i.e. in interactive mode, the first entry is an empty string, which means "current working directory of the Python process". In other words, Python assumes that the file you started wants to be able to import the package/ folders and blah.py modules that sit in the same location as that file.
  2. The other entries in sys.path come from the PYTHONPATH environment variable, followed by the installation-dependent defaults. The latter include the global site-packages folders where pip installs your third-party Python packages (things like requests, numpy and tensorflow).

So, basically: yes, you can trust that Python will find your local package folders and module files first, before any globally installed pip stuff.

Here's an example to explain further:

myproject/ # <-- This is not a package (no __init__.py file).
  modules/ # <-- This is a package (has an __init__.py file).
    __init__.py
    foo.py
  run.py
  second.py

executed with: python /path/to/the/myproject/run.py
will cause sys.path[0] to be "/path/to/the/myproject/"

run.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
import second # will import "/path/to/the/myproject/" + "second.py"

second.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
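The layout above can be tried out without touching a real project. The following sketch builds the same myproject/ tree in a temporary directory and runs run.py from a different working directory; the helper name run_demo_project and the file contents are assumptions made for the demonstration.

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def run_demo_project():
    """Create myproject/ in a temp dir, run run.py, and return its output lines."""
    root = tempfile.mkdtemp()
    project = os.path.join(root, "myproject")
    os.makedirs(os.path.join(project, "modules"))

    files = {
        os.path.join("modules", "__init__.py"): "",
        os.path.join("modules", "foo.py"): "NAME = 'foo'\n",
        "second.py": "import modules.foo as foo\n",
        "run.py": textwrap.dedent("""\
            import sys
            import modules.foo as foo
            import second
            print(sys.path[0])
            print(foo.__file__)
        """),
    }
    for relative_name, content in files.items():
        with open(os.path.join(project, relative_name), "w") as handle:
            handle.write(content)

    # Start from a different working directory to show that sys.path[0]
    # follows the script's location, not the current directory.
    out = subprocess.check_output(
        [sys.executable, os.path.join(project, "run.py")], cwd=root
    )
    return project, out.decode("utf-8").splitlines()


if __name__ == "__main__":
    project_dir, lines = run_demo_project()
    print("project directory:", project_dir)
    print("sys.path[0]:      ", lines[0])
    print("foo loaded from:  ", lines[1])
```

Both printed locations should point into the temporary myproject/ folder, even though the interpreter was started from its parent directory.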

EDIT:

You can run the following command to print a sorted list of the names of all modules that are already loaded when the interpreter starts (plus json, which the command itself imports). These are the things that load before ANY custom files/module folders in your projects. Basically these are names you must avoid in your own custom files:

python -c "import sys, json; print(json.dumps(sorted(list(sys.modules.keys())), indent=4))"

List as of Python 3.9.0 (on Windows):

"__main__",
"_abc",
"_bootlocale",
"_codecs",
"_collections",
"_collections_abc",
"_frozen_importlib",
"_frozen_importlib_external",
"_functools",
"_heapq",
"_imp",
"_io",
"_json",
"_locale",
"_operator",
"_signal",
"_sitebuiltins",
"_sre",
"_stat",
"_thread",
"_warnings",
"_weakref",
"abc",
"builtins",
"codecs",
"collections",
"copyreg",
"encodings",
"encodings.aliases",
"encodings.cp1252",
"encodings.latin_1",
"encodings.utf_8",
"enum",
"functools",
"genericpath",
"heapq",
"io",
"itertools",
"json",
"json.decoder",
"json.encoder",
"json.scanner",
"keyword",
"marshal",
"nt",
"ntpath",
"operator",
"os",
"os.path",
"pywin32_bootstrap",
"re",
"reprlib",
"site",
"sre_compile",
"sre_constants",
"sre_parse",
"stat",
"sys",
"time",
"types",
"winreg",
"zipimport"

So NEVER use any of those names for your .py files or your project module subfolders.
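A candidate file name can also be checked programmatically. This is a minimal sketch rather than an exhaustive check: shadows_standard_module is a hypothetical helper, and it relies on sys.stdlib_module_names, which only exists on Python 3.10 and later, so on older versions it falls back to the compiled-in module names alone.

```python
import sys


def shadows_standard_module(name):
    """Return True if `name` collides with a built-in or standard-library module."""
    known = set(sys.builtin_module_names)
    # sys.stdlib_module_names exists on Python 3.10 and later only.
    known.update(getattr(sys, "stdlib_module_names", ()))
    return name in known


if __name__ == "__main__":
    for candidate in ("sys", "json", "my_project_utils"):
        print(candidate, shadows_standard_module(candidate))
```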

Even though the above answers regarding the order in which the interpreter scans sys.path are correct, giving precedence to e.g. user file paths over packages deployed in site-packages might fail if the full user path is not available in the PYTHONPATH variable.

For example, imagine you have the following structure of namespace packages:

/opt/repo_root
  - project  # this is the base package that brings structure to the namespace hierarchy
  - my_pkg
  - my_pkg-core
  - my_pkg-gui
  - my_pkg-helpers
  - my_pkg-helpers-time_sync

The above packages all have the internal structure and metadata needed to be deployable by conda, and they are all installed as well. Therefore, I can open a Python shell and type:

>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)

/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py

This returns a path inside the Python interpreter's site-packages subfolder. If I manually add the package to be imported to PYTHONPATH, or even to sys.path, nothing changes.

>>> import os

>>> # os.pathsep is the joining separator: ":" for Unix, ";" for NT
>>> os.environ['PYTHONPATH'] = os.pathsep.join([os.environ['PYTHONPATH'], "/opt/repo_root/my_pkg-helpers-time_sync"])

>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)

/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py

This still reports that the package has been imported from site-packages. You need to include the whole hierarchy of paths in PYTHONPATH, as if it were a traditional Python package, and then it will work as you expect:

>>> import os

>>> # os.pathsep is the joining separator: ":" for Unix, ";" for NT
>>> os.environ['PYTHONPATH'] = os.pathsep.join([
... os.environ['PYTHONPATH'],
... "/opt/repo_root",
... "/opt/repo_root/project",
... "/opt/repo_root/project/my_pkg",
... "/opt/repo_root/project/my_pkg-helpers",
... "/opt/repo_root/project/my_pkg-helpers-time_sync"
... ])

>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)

/opt/project/my_pkg/helpers/time_sync/__init__.py
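When reproducing these steps, it is worth checking where the change is made. To my knowledge the interpreter reads PYTHONPATH only once, at startup, so assigning to os.environ in a running session is seen by child processes but does not alter the current sys.path; for the current session, sys.path itself has to be edited. A minimal sketch with a hypothetical helper name:

```python
import os
import sys


def compare_env_and_sys_path(directory):
    """Check whether `directory` shows up in sys.path after each kind of change."""
    original_env = os.environ.get("PYTHONPATH")
    original_path = list(sys.path)
    try:
        # Changing the environment variable of the running process...
        os.environ["PYTHONPATH"] = directory
        seen_after_env_change = directory in sys.path
        # ...versus editing the search path directly.
        sys.path.insert(0, directory)
        seen_after_insert = sys.path[0] == directory
        return seen_after_env_change, seen_after_insert
    finally:
        # Restore the process state so the demo has no side effects.
        sys.path[:] = original_path
        if original_env is None:
            os.environ.pop("PYTHONPATH", None)
        else:
            os.environ["PYTHONPATH"] = original_env


if __name__ == "__main__":
    # Hypothetical directory, chosen so that it is not already on sys.path.
    print(compare_env_and_sys_path("/opt/repo_root/not-on-the-path-yet"))
```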

When importing a module, Python first looks in sys.modules, the cache of modules that have already been imported. If the module is not found there, it searches the sys.path list of directories. There might be other lists Python searches on your operating system.

import time, sys
print(sys.modules)
print(sys.path)

The output is a dict of loaded modules, then a list of directories:

{... , ... , .....}
['C:\\Users\\****', 'C:\\****', ....']

The time module is imported according to this order: sys.modules first, then sys.path.
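To check this for a particular module, the sketch below reports whether a name is already in the sys.modules cache and where the import system would load it from. describe_lookup is a hypothetical helper; importlib.util.find_spec is the standard-library call that does the lookup.

```python
import importlib.util
import sys


def describe_lookup(name):
    """Report whether `name` is already cached and where it would load from."""
    cached = name in sys.modules
    spec = importlib.util.find_spec(name)
    origin = spec.origin if spec is not None else None
    return cached, origin


if __name__ == "__main__":
    import time  # after this line, "time" is guaranteed to be in sys.modules

    print(describe_lookup("time"))
    print(describe_lookup("json"))
```

An origin of None means that no finder located the module on the current search path.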
