
Best practice for Python package on Azure Artifacts feed

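I have developed some Python packages, which are uploaded to Azure DevOps Artifacts through a DevOps pipeline. It works well, but the pipeline stores on the Artifacts feed not only my packages but also the dependencies they declare in the setup.cfg file!

They are ordinary dependencies, pandas and the like, but is it best practice to keep copies of these libraries on Artifacts? By my logic I would say no... How can I prevent this behavior?

These are my pipeline and my cfg file:

Pipeline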

trigger:
  tags:
    include:
      - 'v*.*'
  branches:
    include: 
    - main
    - dev-release

pool:
  vmImage: 'ubuntu-latest'

stages:
  - stage: 'Stage_Test'
    variables:
    - group: UtilsDev
    jobs:
    - job: 'Job_Test'
      steps:
      - task: UsePythonVersion@0
        inputs:
          versionSpec: '$(pythonVersion)'
        displayName: 'Use Python $(pythonVersion)'

      - script: |
          python -m pip install --upgrade pip
        displayName: 'Upgrade PIP'

      - script: |
          pip install pytest pytest-azurepipelines
        displayName: 'Install test dependencies'

      - script: |
          pytest
        displayName: 'Execution of PyTest'

  - stage: 'Stage_Build'
    variables:
    - group: UtilsDev
    jobs:
    - job: 'Job_Build'
      steps:
        - task: UsePythonVersion@0
          inputs:
            versionSpec: '$(pythonVersion)'
          displayName: 'Use Python $(pythonVersion)'

        - script: |
            python -m pip install --upgrade pip
          displayName: 'Upgrade PIP'

        - script: |
            pip install build wheel
          displayName: 'Install build dependencies'

        - script: |
            python -m build
          displayName: 'Artifact creation'

        - publish: '$(System.DefaultWorkingDirectory)'
          artifact: package

  - stage: 'Stage_Deploy_DEV'
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/dev-release'))
    variables:
    - group: UtilsDev
    jobs:
    - deployment: Build_Deploy
      displayName: Build Deploy
      environment: [OMIT]-artifacts-dev
      strategy:
        runOnce:
          deploy:
            steps:
            - download: current
              artifact: package

            - task: UsePythonVersion@0
              inputs:
                versionSpec: '$(pythonVersion)'
              displayName: 'Use Python $(pythonVersion)'

            - script: |
                pip install twine
              displayName: 'Install build dependencies'

            - task: TwineAuthenticate@1
              displayName: 'Twine authentication'
              inputs:
                pythonUploadServiceConnection: 'PythonPackageUploadDEV'

            - script: |
                python -m twine upload --skip-existing --verbose -r $(feedName) --config-file  $(PYPIRC_PATH) dist/*
              workingDirectory: '$(Pipeline.Workspace)/package'              
              displayName: 'Artifact upload'

  - stage: 'Stage_Deploy_PROD'
    dependsOn: 'Stage_Build'
    condition: and(succeeded(), or(eq(variables['Build.SourceBranch'], 'refs/heads/main'), startsWith(variables['Build.SourceBranch'], 'refs/tags/v')))
    variables:
    - group: UtilsProd
    jobs:
    - job: 'Approval_PROD_Release'
      pool: server
      steps:
      - task: ManualValidation@0
        timeoutInMinutes: 1440 # task times out in 1 day
        inputs:
          notifyUsers: |
            [USER]@[OMIT].com
          instructions: 'Please validate the build configuration and resume'
          onTimeout: 'resume'
    - deployment: Build_Deploy
      displayName: Build Deploy
      environment: [OMIT]-artifacts-prod
      strategy:
        runOnce:
          deploy:
            steps:
            - download: current
              artifact: package

            - task: UsePythonVersion@0
              inputs:
                versionSpec: '$(pythonVersion)'
              displayName: 'Use Python $(pythonVersion)'

            - script: |
                pip install twine
              displayName: 'Install build dependencies'

            - task: TwineAuthenticate@1
              displayName: 'Twine authentication'
              inputs:
                pythonUploadServiceConnection: 'PythonPackageUploadPROD'

            - script: |
                python -m twine upload --skip-existing --verbose -r $(feedName) --config-file  $(PYPIRC_PATH) dist/*
              workingDirectory: '$(Pipeline.Workspace)/package'    
              displayName: 'Artifact upload'

Setup file

[metadata]
name = [OMIT]_azure
version = 0.2
author = [USER]
author_email = [USER]@[OMIT].com
description = A package containing utilities for interacting with Azure
long_description = file: README.md
long_description_content_type = text/markdown
project_urls =
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.7
install_requires =
    azure-storage-file-datalake>=12.6.0
    pyspark>=3.2.1
    openpyxl>=3.0.9
    pandas>=1.4.2
    pyarrow>=8.0.0
    fsspec>=2022.3.0
    adlfs>=2022.4.0
    [OMIT]-utils>=0.4

[options.packages.find]
where = src

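I noticed that the pipeline shows this behavior only in the production stage (Stage_Deploy_PROD) and not in the dev release (Stage_Deploy_DEV), and that far more dependencies are stored than the 8 specified in the setup.cfg file.

Has anyone dealt with this problem?

Thanks in advance!!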

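According to the Azure Artifacts documentation, once upstream sources are enabled, every time you install a package from the public registry, Azure Artifacts saves a copy of that package in your feed.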
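To make that mechanism concrete, here is a minimal toy model in Python. It is only a sketch of the save-on-install behavior described above, not the Azure Artifacts API: the `Feed` class, its fields, the package name `my-azure-utils`, and the versions are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Feed:
    """Toy model of a feed with an optional upstream source (not the Azure API)."""

    upstream: dict[str, str]  # package name -> version on the public registry
    upstream_enabled: bool = True
    saved: dict[str, str] = field(default_factory=dict)  # copies stored in the feed

    def publish(self, name: str, version: str) -> None:
        """Upload one of your own packages to the feed."""
        self.saved[name] = version

    def install(self, name: str) -> str:
        """Serve a package, saving a copy when it comes from upstream."""
        if name in self.saved:
            return self.saved[name]
        if not self.upstream_enabled:
            raise LookupError(f"{name} is not in the feed and upstream is disabled")
        version = self.upstream[name]
        self.saved[name] = version  # this is the copy that appears in the feed
        return version


public_registry = {"pandas": "1.4.2", "pyspark": "3.2.1"}  # illustrative versions

feed = Feed(upstream=public_registry)
feed.publish("my-azure-utils", "0.2")  # hypothetical internal package
feed.install("pandas")                 # pulled through the upstream source
print(sorted(feed.saved))              # ['my-azure-utils', 'pandas']
```

In this model, a feed created with `upstream_enabled=False` never stores third-party copies, but it also cannot serve them, so anything not published to the feed would have to be installed from somewhere else.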

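One reason there are more packages in Artifacts than in your setup.cfg file is that when you download a package, the dependencies it requires are downloaded along with it. Take PySpark as an example: PySpark requires Py4J, so downloading PySpark also downloads Py4J.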

This is my test result: when I download only PySpark in the pipeline, Py4J is also downloaded and saved to the Artifacts feed.
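The way a short list of direct requirements grows into a longer list of saved packages can be sketched as a walk over the dependency graph. The `REQUIRES` map below is an illustrative, incomplete assumption (real metadata depends on the package version), and `resolve` is a hypothetical helper, not how pip is implemented.

```python
from collections import deque

# Illustrative, incomplete dependency map; real metadata varies by version.
REQUIRES = {
    "pyspark": ["py4j"],
    "pandas": ["numpy", "python-dateutil", "pytz"],
    "python-dateutil": ["six"],
}


def resolve(direct):
    """Return every package an installer would fetch for `direct`, sorted."""
    seen = set()
    queue = deque(direct)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited, avoid fetching twice
        seen.add(name)
        queue.extend(REQUIRES.get(name, []))  # follow its own requirements
    return sorted(seen)


print(resolve(["pyspark"]))
# ['py4j', 'pyspark']
print(resolve(["pyspark", "pandas"]))
# ['numpy', 'pandas', 'py4j', 'pyspark', 'python-dateutil', 'pytz', 'six']
```

Two direct requirements already expand to seven packages in this small map, which is consistent with seeing many more entries in the feed than the 8 lines listed under install_requires.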

