
How to use Get File Properties REST API for Azure Files Storage using Python

I am trying to create a Python script that uses both the Azure SDK for Python and the REST APIs to extract information about files in my Azure Files storage account.

I am using the SDK to access the files in storage and get their names. Then, using each name, I want to make a REST API call to get the file properties, specifically the Last-Modified property. I tried to access the last modified property using the SDK, but it always returns None for some reason.

I want to use the last modified date to determine whether more than 24 hours have passed, and if so, delete the file. I am not sure if it is possible to set some sort of auto-delete property on the file, triggered after a certain period, when I first create and upload it to Azure. If there is, that would solve my problem anyhow.

I have posted the code I am using below. When I try to make the HTTP request, I get the error "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature."

    import datetime
    import requests
    import json
    import base64
    import hmac
    import hashlib
    import urllib
    from azure.storage.file import *

    StorageAccountConnectionString = ""
    fileshareName = "testFileShare"
    storage_account_name = "testStorage"
    storage_account_key = ""
    api_version = "2018-03-28"

    file_service = FileService(connection_string=StorageAccountConnectionString)
    listOfStateDirectories = file_service.list_directories_and_files(fileshareName)

    for state_directory in listOfStateDirectories:
        print("Cleaning up State Directory: " + state_directory.name)
        if(isinstance(state_directory, Directory)):
            listOfBridgeDirectories = file_service.list_directories_and_files(fileshareName, state_directory.name)
            for bridge_directory in listOfBridgeDirectories:
                if(isinstance(bridge_directory, Directory)):
                    print("Cleaning up Bridge Directory: " + bridge_directory.name)
                    path_to_bridge_directory = state_directory.name + "/" + bridge_directory.name
                    listOfFilesAndFolders = file_service.list_directories_and_files(fileshareName, path_to_bridge_directory)

                for file_or_folder in listOfFilesAndFolders:
                    if isinstance(file_or_folder, File):
                        name_of_file = file_or_folder.name

                        # Get the time of the current request
                        request_time = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')

                        string_to_append_to_url = fileshareName + '/' + path_to_bridge_directory + '/' + name_of_file
                        # Parse the url to make sure everything is good
                        # string_to_append_to_url = urllib.parse.quote(string_to_append_to_url)

                        string_params = {
                            'verb': 'HEAD',
                            'Content-Encoding': '',
                            'Content-Language': '',
                            'Content-Length': '',
                            'Content-MD5': '',
                            'Content-Type': '',
                            'Date': '',
                            'If-Modified-Since': '',
                            'If-Match': '',
                            'If-None-Match': '',
                            'If-Unmodified-Since': '',
                            'Range': '',
                            'CanonicalizedHeaders': 'x-ms-date:' + request_time + '\nx-ms-version:' + api_version + '\n',
                            'CanonicalizedResource': '/' + storage_account_name + '/' + string_to_append_to_url
                        }

                        string_to_sign = (string_params['verb'] + '\n'
                                          + string_params['Content-Encoding'] + '\n'
                                          + string_params['Content-Language'] + '\n'
                                          + string_params['Content-Length'] + '\n'
                                          + string_params['Content-MD5'] + '\n'
                                          + string_params['Content-Type'] + '\n'
                                          + string_params['Date'] + '\n'
                                          + string_params['If-Modified-Since'] + '\n'
                                          + string_params['If-Match'] + '\n'
                                          + string_params['If-None-Match'] + '\n'
                                          + string_params['If-Unmodified-Since'] + '\n'
                                          + string_params['Range'] + '\n'
                                          + string_params['CanonicalizedHeaders']
                                          + string_params['CanonicalizedResource'])

                        signed_string = base64.b64encode(hmac.new(base64.b64decode(storage_account_key), msg=string_to_sign.encode('utf-8'), digestmod=hashlib.sha256).digest()).decode()

                        headers = {
                            'x-ms-date': request_time,
                            'x-ms-version': api_version,
                            'Authorization': ('SharedKey ' + storage_account_name + ':' + signed_string)
                        }

                        url = ('https://' + storage_account_name + '.file.core.windows.net/' + string_to_append_to_url)
                        print(url)


                        r = requests.get(url, headers=headers)
                        print(r.content)

NOTE: Some of the directories will have white spaces in their names, so I am not sure if this is affecting the REST API call, because the URL will also contain spaces. If it does affect it, how would I go about accessing files whose URLs contain spaces?

> I try to access the last modified property using the SDK but it always returns None for some reason.

Not every SDK method or REST API returns the Last-Modified property in the response headers. Those that do not include the REST API List Directories and Files and the Python SDK method list_directories_and_files.

I tried to reproduce your issue using the SDK, with the code below.

    generator = file_service.list_directories_and_files(share_name, directory_name)
    for file_or_dir in generator:
        if isinstance(file_or_dir, File):
            print(file_or_dir.name, file_or_dir.properties.last_modified)

Because the list_directories_and_files method does not populate properties such as last_modified on the File objects it returns, file_or_dir.properties.last_modified in the code above is None.

The REST APIs Get File, Get File Properties, and Get File Metadata, and the Python SDK methods get_file_properties and get_file_metadata, do return the Last-Modified property in the response headers. To get the last_modified property, change the code as below.

    generator = file_service.list_directories_and_files(share_name, directory_name)
    for file_or_dir in generator:
        if isinstance(file_or_dir, File):
            file_name = file_or_dir.name
            file = file_service.get_file_properties(share_name, directory_name, file_name, timeout=None, snapshot=None)
            print(file_or_dir.name, file.properties.last_modified)
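
With last_modified available, the 24-hour cleanup from the question could be sketched roughly as follows. The helper names is_stale and delete_stale_files are hypothetical, and the sketch assumes that last_modified is a datetime (treated as UTC if it carries no timezone) and that the legacy azure.storage.file FileService.delete_file method is used for deletion.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_modified, now=None, max_age=timedelta(hours=24)):
    """Return True if last_modified is more than max_age before now."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Assumption: a naive datetime is interpreted as UTC.
    if last_modified.tzinfo is None:
        last_modified = last_modified.replace(tzinfo=timezone.utc)
    return now - last_modified > max_age


def delete_stale_files(file_service, share_name, directory_name):
    """Delete files in one directory that were last modified over 24 hours ago."""
    # Imported lazily so that is_stale can be used without the SDK installed.
    from azure.storage.file import File

    for entry in file_service.list_directories_and_files(share_name, directory_name):
        if isinstance(entry, File):
            # The listing does not carry last_modified, so fetch it per file.
            props = file_service.get_file_properties(share_name, directory_name, entry.name)
            if is_stale(props.properties.last_modified):
                file_service.delete_file(share_name, directory_name, entry.name)
```

Keeping the age check in its own function makes it easy to verify the 24-hour rule separately before letting the script delete anything.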

Of course, calling the REST API gives the same result as using the SDK. However, building the Shared Key signature string by hand is error-prone and makes the code harder to read.
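
For anyone who still wants the REST route, here is a minimal sketch of Shared Key signing for a request with no body and no query string, such as Get File Properties. The function name build_shared_key_headers and its parameters are hypothetical. The sketch percent-encodes the path, which is one way to handle directory names containing spaces. It does not send the request, and whether it resolves the specific error in the question has not been verified here.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote


def build_shared_key_headers(account_name, account_key, verb, resource_path,
                             api_version="2018-03-28", request_time=None):
    """Return (url, headers) for a Shared Key request without body or query string.

    resource_path is the unencoded "share/dir/file" path. All names here are
    illustrative and not part of any SDK.
    """
    if request_time is None:
        request_time = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")

    # Percent-encode spaces and other reserved characters, keeping "/" as is.
    encoded_path = quote(resource_path, safe="/")

    # x-ms-* headers: lowercase names, sorted, each terminated by a newline.
    ms_headers = {"x-ms-date": request_time, "x-ms-version": api_version}
    canonicalized_headers = "".join(
        "{}:{}\n".format(name, value) for name, value in sorted(ms_headers.items())
    )
    canonicalized_resource = "/{}/{}".format(account_name, encoded_path)

    # The verb, then 11 standard header slots (Content-Encoding through Range),
    # all left empty here, each followed by a newline.
    string_to_sign = (verb + "\n" + "\n" * 11
                      + canonicalized_headers + canonicalized_resource)

    key = base64.b64decode(account_key)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode()

    headers = dict(ms_headers)
    headers["Authorization"] = "SharedKey {}:{}".format(account_name, signature)
    url = "https://{}.file.core.windows.net/{}".format(account_name, encoded_path)
    return url, headers


# Possible usage (not executed here; requires the requests package and real credentials):
#   url, headers = build_shared_key_headers(name, key, "HEAD", "share/my dir/file.txt")
#   response = requests.head(url, headers=headers)
#   print(response.headers.get("Last-Modified"))
```

The verb passed to the signing function must match the HTTP method actually sent. A signature computed for HEAD will not validate a GET request, so a HEAD signature pairs with requests.head rather than requests.get.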
