
How do I retrieve all directory paths from Azure Data Lake Storage Gen2 using Python?

I'm trying to retrieve the paths of all directories in Azure Data Lake Storage Gen2 using the approach mentioned here. Here's my code:

from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Connect to the storage account with an Azure AD service principal
def initialize_storage_account_ad(storage_account_name, client_id, client_secret, tenant_id):

    try:
        global service_client

        credential = ClientSecretCredential(tenant_id, client_id, client_secret)

        service_client = DataLakeServiceClient(account_url="{}://{}.dfs.core.windows.net".format(
            "https", storage_account_name), credential=credential)

    except Exception as e:
        print(e)

# List directory contents (prints files and directories alike)
def list_directory_contents():
    try:

        file_system_client = service_client.get_file_system_client(file_system="my-file-system")

        paths = file_system_client.get_paths(path="my-directory")

        for path in paths:
            print(path.name + '\n')

    except Exception as e:
        print(e)

The FileSystemClient.get_paths method returns paths to both files and directories.

Is there an efficient way to retrieve or filter only the directory paths?

Please advise.

get_paths returns an iterable that lists the paths (files or directories) under the specified file system, together with the properties of each path.

Each returned path has an is_directory property, so checking path.is_directory lets you tell directories apart from files.
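
As a rough sketch of that filter: the helper name iter_directory_names and the sample entries below are made up for illustration, and SimpleNamespace objects stand in for the path entries that get_paths yields so the snippet runs without an Azure account. It only assumes each entry exposes name and is_directory, as described above.

```python
from types import SimpleNamespace


def iter_directory_names(paths):
    """Yield the name of every entry whose is_directory flag is truthy.

    `paths` can be any iterable of objects exposing `name` and
    `is_directory`, such as the result of FileSystemClient.get_paths.
    """
    for path in paths:
        if path.is_directory:
            yield path.name


# Hypothetical stand-in entries (not real storage contents).
sample_paths = [
    SimpleNamespace(name="my-directory/logs", is_directory=True),
    SimpleNamespace(name="my-directory/logs/app.txt", is_directory=False),
    SimpleNamespace(name="my-directory/raw", is_directory=True),
    SimpleNamespace(name="my-directory/readme.md", is_directory=False),
]

for directory_name in iter_directory_names(sample_paths):
    print(directory_name)
```

With the real client from the question, the call would look like iter_directory_names(file_system_client.get_paths(path="my-directory")). The check runs on the client as results are iterated, so file entries are still fetched from the service before being skipped.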


 