python pip priority order with index-url and extra-index-url

I searched a bit but could not find a clear answer.
The goal is to have two pip indexes: a private index, which takes first priority, and the standard PyPI. The priority is there to prevent the security risk of code injection.

Say I have a library named lib, and I configure index_url = http://my_private_pypi_repo and extra_index_url = https://pypi.org/simple.

If I run pip install lib and lib exists in both indexes, which index gets priority? Where will it be installed from?

Also, if I run pip install lib==0.0.2 but lib exists in my private index only at version 0.0.1, is pip going to look at PyPI as well?

And what is a good way to stay in control, so that certain libraries are fetched only from the private index if they exist there, and are never looked up on PyPI?

The short answer is: there is no prioritization, and you should probably avoid using --extra-index-url entirely.


This was asked and answered here: https://github.com/pypa/pip/issues/5045#issuecomment-369521345

Question:

I have this in my pip.conf:

    [global]
    index-url = https://myregistry-xyz.com
    extra-index-url = https://pypi.python.org/pypi

Let's assume packageX exists in both registries and I run pip install packageX.

I expect pip to install packageX from https://myregistry-xyz.com, but pip will use https://pypi.python.org/pypi instead.

If I switch the values for index-url and extra-index-url I get the same result. pypi is always prioritized.

Answer:

Packages are expected to be unique up to name and version, so two wheels with the same package name and version are treated as indistinguishable by pip. This is a deliberate feature of the package metadata, and not likely to change.


I would also recommend reading this discussion: https://discuss.python.org/t/dependency-notation-including-the-index-url/5659

Quite a lot of things are addressed in this discussion, some of which are clearly out of scope for this question, but all of it is very informative anyway.

The key takeaway for you from that discussion:

In theory, pip does not prioritize one index over the other. In practice, because of a coincidence in the way things are implemented in code, it might be that one is always checked first, but it is not a behavior you should rely on.
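To make the consequence concrete, here is a minimal sketch of that behavior. It is a simplified model, not pip's actual code: it assumes pip pools the candidates from every configured index and then chooses by version alone, as the answers above describe. The `Candidate` type, the `pick` helper and the index labels are hypothetical, and the version parsing only handles plain dotted integers rather than full PEP 440 versions.

```python
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    name: str
    version: str
    index: str  # hypothetical label for the index that served this file


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '0.0.2' into (0, 0, 2). Plain dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def pick(candidates: List[Candidate], pinned: Optional[str] = None) -> Optional[Candidate]:
    """Pool candidates from all indexes, then choose by version alone."""
    pool = [c for c in candidates if pinned is None or c.version == pinned]
    if not pool:
        return None
    # The index a candidate came from plays no role in the choice.
    return max(pool, key=lambda c: version_key(c.version))


# The scenario from the question: lib 0.0.1 is in the private index,
# while someone has uploaded lib 0.0.1 and 0.0.2 to the public index.
private = [Candidate("lib", "0.0.1", "private")]
public = [Candidate("lib", "0.0.1", "pypi"), Candidate("lib", "0.0.2", "pypi")]

best = pick(private + public)                    # unpinned install
pinned = pick(private + public, pinned="0.0.2")  # pinned to a version only PyPI has
```

Under this model both `best` and `pinned` come from the public index, which is exactly the code injection risk the question wants to avoid: a higher version number on PyPI wins regardless of which option the private index was listed under.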

And what is a good way to be in control, that certain libraries will only be fetched from the private index if they exists there, and will not be looked for at PyPI?

You should set up and curate your own package index (devpi, pydist, jfrog artifactory, sonatype nexus, etc.) and use it exclusively, meaning: never use --extra-index-url. This is the only way you can have exact control over what gets downloaded. This custom repository might function mostly as a proxy for the public PyPI, except for a couple of dependencies.
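If you adopt the "single index only" rule, a small check can help enforce it. The sketch below scans the text of a pip configuration file for any extra-index-url setting. The function name `find_extra_indexes` and the sample configuration are illustrative assumptions. The key normalization (lowercasing, treating `_` like `-`) is a defensive choice here, not a statement about how pip parses its files.

```python
import configparser
from typing import List, Tuple


def find_extra_indexes(config_text: str) -> List[Tuple[str, str]]:
    """Return (section, url) pairs for every extra-index-url entry found."""
    # RawConfigParser avoids '%' interpolation, which URLs may contain.
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    found = []
    for section in parser.sections():
        for key, value in parser.items(section):
            # Defensive normalization: accept extra_index_url as well.
            if key.lower().replace("_", "-") == "extra-index-url":
                # A value may list several URLs separated by whitespace.
                found.extend((section, url) for url in value.split())
    return found


# Sample taken from the quoted GitHub question above.
SAMPLE = """
[global]
index-url = https://myregistry-xyz.com
extra-index-url = https://pypi.python.org/pypi
"""

offenders = find_extra_indexes(SAMPLE)
```

A check like this could run in CI against the pip.conf files you ship; an empty result means only index-url is in use. It does not cover --extra-index-url passed on the command line or set through environment variables.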


Related:

The title of this question feels a bit like an instance of the XY problem [1]. If you elaborate on what you want to achieve and what your constraints are, we may be able to give you a better answer.

That said, sinoroc's suggestion to curate your own package index and use only that is a good one. A few other ideas also come to mind:

  • Update: It turns out pip may run distributions other than those in the constraints file, so this method should probably be considered insecure. Additionally, hashes are somewhat broken in recent releases of pip.

    Using a constraints file with hashes. This file can be generated with pip-tools, for example pip-compile --generate-hashes, assuming you have documented your dependencies in a file named requirements.in. You can then install packages like pip install -c requirements.txt some_package.

    • Pro: What may be installed is documented alongside your code in your VCS.
    • Con: Controlling what is downloaded the first time is either tricky or laborious.
    • Con: Hash checking can be slow.
    • Con: You run into issues more frequently than when not using hashes. Some can be worked around, others cannot; it is for instance not possible to combine constraints like -e file:// with hashes.
  • Use an alternative packaging tool like pipenv. It works similarly to the previous suggestion.

    • Pro: Easy to use.
    • Con: Harder to integrate into your workflow if it does not fit naturally.
  • Curate packages locally. Packages and dependencies can be downloaded like pip download --dest some_dir some_package and installed like pip install --no-index --find-links some_dir some_package.

    • Pro: What may be installed can be documented alongside your code, if you track the artifacts in VCS, e.g. with git lfs.
    • Con: Either all packages are downloaded or none are.
  • Use a hermetic build system. I know Bazel advertises this as a feature; I am not sure about others like Pants and Buck.

    • Pro: May be the ultimate solution if you want control over your builds.
    • Con: Does not integrate well with the open source Python ecosystem, as far as I know.
    • Con: A lot of overhead.
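For the "curate packages locally" idea in the list above, the two steps can be scripted. This is only a sketch: the helper names `download_cmd` and `offline_install_cmd` are hypothetical, and `some_dir` and `some_package` are the placeholders used in the answer. The block only builds the argument lists and runs nothing, so it is safe to execute as-is.

```python
import sys
from typing import List


def download_cmd(dest: str, requirement: str) -> List[str]:
    """Command that fetches a package and its dependencies into dest."""
    return [sys.executable, "-m", "pip", "download", "--dest", dest, requirement]


def offline_install_cmd(find_links: str, requirement: str) -> List[str]:
    """Command that installs from a local directory only, never from an index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                # do not contact any package index
        "--find-links", find_links,  # resolve everything from this directory
        requirement,
    ]


download = download_cmd("some_dir", "some_package")
install = offline_install_cmd("some_dir", "some_package")
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Invoking pip through `sys.executable -m pip` makes sure the command targets the same interpreter that runs the script. The download step is the one that needs review, since it is the only point at which anything is fetched from an index.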

[1]: https://en.wikipedia.org/wiki/XY_proble
