
Azure Data Factory Pipelines: Creating pipelines with Python: Authentication (via az cli)

I'm trying to create Azure Data Factory pipelines with Python, using the example provided by Microsoft here:

https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-python

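from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.datafactory import DataFactoryManagementClient

def main():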

    # Azure subscription ID
    subscription_id = '<Specify your Azure Subscription ID>'

    # This program creates this resource group. If it's an existing resource group, comment out the code that creates the resource group
    rg_name = 'ADFTutorialResourceGroup'

    # The data factory name. It must be globally unique.
    df_name = '<Specify a name for the data factory. It must be globally unique>'

    # Specify your Active Directory client ID, client secret, and tenant ID
    credentials = ServicePrincipalCredentials(client_id='<Active Directory application/client ID>', secret='<client secret>', tenant='<Active Directory tenant ID>')
    resource_client = ResourceManagementClient(credentials, subscription_id)
    adf_client = DataFactoryManagementClient(credentials, subscription_id)

    rg_params = {'location':'eastus'}
    df_params = {'location':'eastus'}

However, I cannot pass the credentials in as shown above, because the Azure login is carried out as a separate, earlier step in the pipeline. That leaves me with an authenticated Azure session, and no other credentials may be passed into this script.

Before I run the Python code to create the pipeline, I run "az login" from a Jenkins deployment pipeline, which gives me an authenticated azurerm session. I should be able to reuse this session in the Python script to get a Data Factory client, without authenticating again.

However, I'm unsure how to modify the client-creation part of the code, as there do not seem to be any examples that use an already-established azurerm session:

    adf_client = DataFactoryManagementClient(credentials, subscription_id)

    rg_params = {'location':'eastus'}
    df_params = {'location':'eastus'}

    # Create a data factory
    df_resource = Factory(location='eastus')
    df = adf_client.factories.create_or_update(rg_name, df_name, df_resource)
    print_item(df)
    while df.provisioning_state != 'Succeeded':
        df = adf_client.factories.get(rg_name, df_name)
        time.sleep(1)

Microsoft's authentication documentation suggests I can authenticate using a previously established session as follows:

from azure.common.client_factory import get_client_from_cli_profile
from azure.mgmt.compute import ComputeManagementClient

client = get_client_from_cli_profile(ComputeManagementClient)

( ref: https://docs.microsoft.com/en-us/python/azure/python-sdk-azure-authenticate?view=azure-python )

This works, but the call that creates the data factory then fails with:

Traceback (most recent call last):
  File "post-scripts/check-data-factory.py", line 72, in <module>
    main()
  File "post-scripts/check-data-factory.py", line 65, in main
    df = adf_client.factories.create_or_update(rg_name, data_factory_name, df_resource)

AttributeError: 'ComputeManagementClient' object has no attribute 'factories'

So perhaps some extra steps are required between this and getting a df object?

Any clue appreciated!

Just replace the class with the correct type:

from azure.common.client_factory import get_client_from_cli_profile
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.datafactory import DataFactoryManagementClient

resource_client = get_client_from_cli_profile(ResourceManagementClient)
adf_client = get_client_from_cli_profile(DataFactoryManagementClient)

The error you got is because you created a Compute client (which manages VMs), not an ADF client. But yes, you found the right doc for your needs :)
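
Combined with the factory-creation call from the question, this could be sketched roughly as follows. It assumes "az login" has already run in the same environment, and that the installed azure-common, azure-mgmt-resource and azure-mgmt-datafactory packages are versions that support get_client_from_cli_profile. The helper names build_clients and create_factory are made up for illustration. The subscription ID is read from the CLI profile, so it is not passed in.

```python
def build_clients():
    """Create management clients from the session left behind by `az login`."""
    # Imported lazily so the Azure packages are only needed when this runs.
    from azure.common.client_factory import get_client_from_cli_profile
    from azure.mgmt.resource import ResourceManagementClient
    from azure.mgmt.datafactory import DataFactoryManagementClient

    # Credentials and subscription ID both come from the CLI profile.
    resource_client = get_client_from_cli_profile(ResourceManagementClient)
    adf_client = get_client_from_cli_profile(DataFactoryManagementClient)
    return resource_client, adf_client


def create_factory(adf_client, rg_name, df_name, df_resource):
    """Create or update a data factory with the same call the question uses."""
    return adf_client.factories.create_or_update(rg_name, df_name, df_resource)


def main():
    # Hypothetical entry point; the names below are taken from the question.
    from azure.mgmt.datafactory.models import Factory

    rg_name = 'ADFTutorialResourceGroup'
    df_name = '<Specify a name for the data factory. It must be globally unique>'

    _, adf_client = build_clients()
    df = create_factory(adf_client, rg_name, df_name, Factory(location='eastus'))
    print(df.provisioning_state)
```

Calling main() after "az login" should then create the factory without any credentials appearing in the script.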

(disclosure: I work at MS in the Python SDK team)
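
A side note on the polling loop in the question: as written, it never exits if provisioning does not reach 'Succeeded'. Below is a minimal sketch of a bounded wait. The helper name wait_until_provisioned, the default timeout, and the 'Failed' state value are assumptions made for illustration, not something taken from the quickstart.

```python
import time


def wait_until_provisioned(fetch_factory, timeout=300.0, interval=1.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_factory() until provisioning succeeds, fails, or times out."""
    deadline = clock() + timeout
    while True:
        df = fetch_factory()
        state = df.provisioning_state
        if state == 'Succeeded':
            return df
        # Assumption: 'Failed' is treated as a terminal state here.
        if state == 'Failed':
            raise RuntimeError('data factory provisioning failed')
        if clock() >= deadline:
            raise TimeoutError('provisioning state is still %r' % state)
        sleep(interval)
```

With the clients from the answer, it might be called as wait_until_provisioned(lambda: adf_client.factories.get(rg_name, df_name)).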

