
How to copy a file from Azure Storage to a VM in agent pool in a pipeline?

So, I'm setting up a pipeline where an agent pool assigns a virtual machine (VM) to do the work. One of the steps requires a large file. That file lives in Azure Storage (a storage account).

I'd like to access the file through an Azure service rather than over the public internet (if that makes sense). For example, I don't want to put a SAS token in the URL and fetch it with the Azure CLI. I want to be able to define the task in the pipeline using AzureFileCopy.

I read the docs, but I don't know what I should put for sourcePath. Is that the full URL of the file I need? Also, the VM is assigned from the pool, so I don't have a vmsAdminUserName or vmsAdminPassword for it, yet if I set the destination to azureVMs, vmsAdminUserName and vmsAdminPassword are mandatory fields.

I just want to have a file available on the VM the pool assigns to my pipeline!

I think you misunderstand the usage of the 'Azure File Copy' task.

Your requirement is to get a big file from Azure Storage onto the agent VM.

But 'Azure File Copy' is used to "Copy files to Azure Blob Storage or virtual machines" — that is, the source is local files on the agent, not a blob.

(screenshot of the Azure File Copy task description)

There is no built-in task to achieve your requirement.
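For illustration, a typical 'Azure File Copy' usage uploads local build output to a storage account, so `SourcePath` is a path on the agent, not a blob URL (a sketch with placeholder values):

```yaml
- task: AzureFileCopy@4
  inputs:
    SourcePath: '$(Build.ArtifactStagingDirectory)/bigfile.bin'  # a local path on the agent
    azureSubscription: '<ARM service connection>'                # placeholder
    Destination: 'AzureBlob'
    storage: '<storage account name>'
    ContainerName: '<container name>'
```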

I wrote a script that meets your requirement; you just need to pass in the required variables. (The underlying principle is the same as the 'Azure File Copy' task: it authenticates to Azure Storage through a service principal.)

My pipeline:

trigger:
- none

pool:
  name: VMAS #This is my agent pool, and this YAML is tested on Windows VM

steps:
- task: PythonScript@0
  inputs:
    scriptSource: 'inline'
    script: |
      import os
      import requests
      from urllib.parse import urlencode
      
      grant_type = "client_credentials"
      client_id = "$(client_id)"
      client_secret = "$(client_secret)"
      resource = "https://storage.azure.com/"
      tenant_id = "$(tenant_id)"
      
      blobstorage_account_name = "<Your Storage account name>"
      blobstorage_container_name = "<Your Blob container name>"
      blob_name = "SomeFiles/Folder1/Folder2/test.csv"  # This is my blob name.
      
      keepstructure = True  # Whether to keep the folder structure.
      
      def getazuretoken(grant_type, client_id, client_secret, resource):
          """Get an AAD access token for Azure Storage via the client-credentials flow."""
          url = "https://login.microsoftonline.com/" + tenant_id + "/oauth2/token"
          payload = urlencode({
              "grant_type": grant_type,
              "client_id": client_id,
              "client_secret": client_secret,
              "resource": resource,
          })
          headers = {"Content-Type": "application/x-www-form-urlencoded"}
          response = requests.post(url, headers=headers, data=payload)
          response.raise_for_status()
          return response.json()["access_token"]
      
      token = getazuretoken(grant_type, client_id, client_secret, resource)
      
      # Get the blob using the service-principal token.
      def downloadazureblob(blob_name, keepstructure):
          blob_url = ("https://" + blobstorage_account_name + ".blob.core.windows.net/"
                      + blobstorage_container_name + "/" + blob_name)
          # The token must be sent in the Authorization header; bearer auth
          # also requires the x-ms-version header on Azure Storage requests.
          blob_headers = {
              "Authorization": "Bearer " + token,
              "x-ms-version": "2019-12-12",
          }
          blob_response = requests.get(blob_url, headers=blob_headers)
          blob_response.raise_for_status()
      
          cur_path = r"$(Build.SourcesDirectory)"
          file_name = blob_name.split("/")[-1]
          if keepstructure:
              # Recreate the blob's virtual folder structure locally.
              folder_path = os.path.join(cur_path, *blob_name.split("/")[:-1])
              os.makedirs(folder_path, exist_ok=True)
              file_path = os.path.join(folder_path, file_name)
          else:
              # Save the blob flat, without the folder structure.
              file_path = os.path.join(cur_path, file_name)
          with open(file_path, "wb") as f:
              f.write(blob_response.content)
          return file_path
      
      downloadazureblob(blob_name, keepstructure)
- task: ArchiveFiles@2
  inputs:
    rootFolderOrFile: '$(System.DefaultWorkingDirectory)'
    includeRootFolder: true
    archiveType: 'zip'
    archiveFile: '$(Build.ArtifactStagingDirectory)/$(Build.BuildId).zip'
    replaceExistingArchive: true
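As an alternative to the raw REST calls in the script above, the same service-principal download can be done with the official `azure-storage-blob` and `azure-identity` packages, which handle token acquisition and retries for you. A minimal sketch (assumes both packages are installed on the agent, e.g. via `pip install azure-storage-blob azure-identity`; the function names are my own):

```python
import os

def local_path_for(blob_name: str, dest_root: str, keep_structure: bool = True) -> str:
    """Map a blob name like 'SomeFiles/Folder1/Folder2/test.csv' to a local
    file path, optionally recreating the blob's virtual folder structure."""
    parts = blob_name.split("/")
    if keep_structure:
        return os.path.join(dest_root, *parts)
    return os.path.join(dest_root, parts[-1])

def download_blob_with_sdk(account: str, container: str, blob_name: str,
                           dest_root: str, tenant_id: str, client_id: str,
                           client_secret: str, keep_structure: bool = True) -> str:
    """Download one blob to dest_root using a service principal."""
    # Imported lazily so local_path_for stays usable without the SDK installed.
    from azure.identity import ClientSecretCredential
    from azure.storage.blob import BlobClient

    credential = ClientSecretCredential(tenant_id, client_id, client_secret)
    client = BlobClient(
        account_url=f"https://{account}.blob.core.windows.net",
        container_name=container,
        blob_name=blob_name,
        credential=credential,
    )
    target = local_path_for(blob_name, dest_root, keep_structure)
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    with open(target, "wb") as f:
        f.write(client.download_blob().readall())
    return target
```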

For security reasons, please store the client_id, client_secret and tenant_id as pipeline variables and mark them as secret:
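If those values live in an Azure DevOps variable group (optionally backed by Key Vault) instead of per-pipeline variables, they can be pulled into the YAML like this (a sketch; `storage-sp-credentials` is a hypothetical group name):

```yaml
variables:
- group: storage-sp-credentials  # hypothetical group holding client_id, client_secret, tenant_id
```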

(screenshots of the pipeline variable settings)

This is the original blob:

(screenshot)

The file is successfully copied to the VM:

(screenshot)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address. Any questions, please contact: yoyou2525@163.com.

 