
Best practice for Python packages on an Azure Artifacts feed

I have developed some Python packages that are uploaded to Azure DevOps Artifacts through a DevOps pipeline. It works well, but the feed ends up storing not only my packages but also the dependencies declared in their setup.cfg files!

They are ordinary dependencies, pandas and the like, but is storing copies of these libraries on Artifacts really best practice? My instinct says no... How can I prevent this behavior?

These are my pipeline and my cfg file:

Pipeline

trigger:
  tags:
    include:
      - 'v*.*'
  branches:
    include: 
    - main
    - dev-release

pool:
  vmImage: 'ubuntu-latest'

stages:
  - stage: 'Stage_Test'
    variables:
    - group: UtilsDev
    jobs:
    - job: 'Job_Test'
      steps:
      - task: UsePythonVersion@0
        inputs:
          versionSpec: '$(pythonVersion)'
        displayName: 'Use Python $(pythonVersion)'

      - script: |
          python -m pip install --upgrade pip
        displayName: 'Upgrade PIP'

      - script: |
          pip install pytest pytest-azurepipelines
        displayName: 'Install test dependencies'

      - script: |
          pytest
        displayName: 'Execution of PyTest'

  - stage: 'Stage_Build'
    variables:
    - group: UtilsDev
    jobs:
    - job: 'Job_Build'
      steps:
        - task: UsePythonVersion@0
          inputs:
            versionSpec: '$(pythonVersion)'
          displayName: 'Use Python $(pythonVersion)'

        - script: |
            python -m pip install --upgrade pip
          displayName: 'Upgrade PIP'

        - script: |
            pip install build wheel
          displayName: 'Install build dependencies'

        - script: |
            python -m build
          displayName: 'Artifact creation'

        - publish: '$(System.DefaultWorkingDirectory)'
          artifact: package

  - stage: 'Stage_Deploy_DEV'
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/dev-release'))
    variables:
    - group: UtilsDev
    jobs:
    - deployment: Build_Deploy
      displayName: Build Deploy
      environment: [OMIT]-artifacts-dev
      strategy:
        runOnce:
          deploy:
            steps:
            - download: current
              artifact: package

            - task: UsePythonVersion@0
              inputs:
                versionSpec: '$(pythonVersion)'
              displayName: 'Use Python $(pythonVersion)'

            - script: |
                pip install twine
              displayName: 'Install build dependencies'

            - task: TwineAuthenticate@1
              displayName: 'Twine authentication'
              inputs:
                pythonUploadServiceConnection: 'PythonPackageUploadDEV'

            - script: |
                python -m twine upload --skip-existing --verbose -r $(feedName) --config-file  $(PYPIRC_PATH) dist/*
              workingDirectory: '$(Pipeline.Workspace)/package'              
              displayName: 'Artifact upload'

  - stage: 'Stage_Deploy_PROD'
    dependsOn: 'Stage_Build'
    condition: and(succeeded(), or(eq(variables['Build.SourceBranch'], 'refs/heads/main'), startsWith(variables['Build.SourceBranch'], 'refs/tags/v')))
    variables:
    - group: UtilsProd
    jobs:
    - job: 'Approval_PROD_Release'
      pool: server
      steps:
      - task: ManualValidation@0
        timeoutInMinutes: 1440 # task times out in 1 day
        inputs:
          notifyUsers: |
            [USER]@[OMIT].com
          instructions: 'Please validate the build configuration and resume'
          onTimeout: 'resume'
    - deployment: Build_Deploy
      displayName: Build Deploy
      environment: [OMIT]-artifacts-prod
      strategy:
        runOnce:
          deploy:
            steps:
            - download: current
              artifact: package

            - task: UsePythonVersion@0
              inputs:
                versionSpec: '$(pythonVersion)'
              displayName: 'Use Python $(pythonVersion)'

            - script: |
                pip install twine
              displayName: 'Install build dependencies'

            - task: TwineAuthenticate@1
              displayName: 'Twine authentication'
              inputs:
                pythonUploadServiceConnection: 'PythonPackageUploadPROD'

            - script: |
                python -m twine upload --skip-existing --verbose -r $(feedName) --config-file  $(PYPIRC_PATH) dist/*
              workingDirectory: '$(Pipeline.Workspace)/package'    
              displayName: 'Artifact upload'

setup.cfg

[metadata]
name = [OMIT]_azure
version = 0.2
author = [USER]
author_email = [USER]@[OMIT].com
description = A package containing utilities for interacting with Azure
long_description = file: README.md
long_description_content_type = text/markdown
project_urls =
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.7
install_requires =
    azure-storage-file-datalake>=12.6.0
    pyspark>=3.2.1
    openpyxl>=3.0.9
    pandas>=1.4.2
    pyarrow>=8.0.0
    fsspec>=2022.3.0
    adlfs>=2022.4.0
    [OMIT]-utils>=0.4

[options.packages.find]
where = src

I have noticed that the pipeline only shows this behavior in the production stage (Stage_Deploy_PROD) and not in the dev release (Stage_Deploy_DEV), and that the stored dependencies are far more than the 8 specified in the setup.cfg file.

Has anyone dealt with this before?

Thanks in advance!!

According to this documentation, once upstream sources are enabled, Azure Artifacts saves a copy of every package you install from the public registry into your feed.
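For context, that caching only happens when the install is routed through the feed's own index URL rather than pypi.org directly. A minimal sketch of such an install (the organization and feed names below are placeholders, not values taken from the question):

# Installing through the Azure Artifacts index: any package pip pulls from the
# PyPI upstream along the way is also saved as a copy in the feed.
pip install pandas --index-url https://pkgs.dev.azure.com/<ORGANIZATION>/_packaging/<FEED>/pypi/simple/

An install that goes against the default pypi.org index never touches the feed, so no copy is stored there.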

One reason there are more packages in Artifacts than in your setup.cfg file is that when a package is downloaded, its required dependencies are downloaded along with it. Take PySpark as an example: when PySpark is downloaded, Py4J is downloaded too, because PySpark requires it.

Here is my test result: when I download only PySpark in the pipeline, Py4J is also downloaded and saved into the Artifacts feed.
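You can reproduce the same transitive pull locally (a quick check, not part of the original answer): pip download fetches a package together with everything it requires.

# Download pyspark and its required dependencies into ./deps;
# py4j shows up alongside the pyspark archive.
pip download pyspark -d ./deps
ls ./deps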
