How to use Get File Properties REST API for Azure Files Storage using Python

I am trying to create a Python script that uses both the Azure SDK for Python and the REST API to extract information about files in my Azure Files storage account.

I am using the SDK to access the files in storage and get their names. Using each name, I then want to make a REST API call to get the file properties, specifically the Last-Modified property. I tried to access the last-modified property through the SDK, but it always returns None for some reason.

I want to use the last-modified date to determine whether more than 24 hours have passed, and if so, delete the file. I am not sure whether it is possible to set some sort of auto-delete-after-a-certain-period property on the file when I first create it and upload it to Azure. If it is, that would solve my problem anyway.

I have posted the code I am using below. When I try to make the HTTP request I get the error "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature."

    import datetime
    import requests
    import json
    import base64
    import hmac
    import hashlib
    import urllib
    from azure.storage.file import *

    StorageAccountConnectionString = ""
    fileshareName = "testFileShare"
    storage_account_name = "testStorage"
    storage_account_key = ""
    api_version = "2018-03-28"

    file_service = FileService(connection_string=StorageAccountConnectionString)
    listOfStateDirectories = file_service.list_directories_and_files(fileshareName)

    for state_directory in listOfStateDirectories:
        print("Cleaning up State Directory: " + state_directory.name)
        if(isinstance(state_directory, Directory)):
            listOfBridgeDirectories = file_service.list_directories_and_files(fileshareName, state_directory.name)
            for bridge_directory in listOfBridgeDirectories:
                if(isinstance(bridge_directory, Directory)):
                    print("Cleaning up Bridge Directory: " + bridge_directory.name)
                    path_to_bridge_directory = state_directory.name + "/" + bridge_directory.name
                    listOfFilesAndFolders = file_service.list_directories_and_files(fileshareName, path_to_bridge_directory)

                    for file_or_folder in listOfFilesAndFolders:
                        if isinstance(file_or_folder, File):
                            name_of_file = file_or_folder.name

                            # Get the time of the current request
                            request_time = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')

                            string_to_append_to_url = fileshareName + '/' + path_to_bridge_directory + '/' + name_of_file
                            # Parse the url to make sure everything is good
                            # string_to_append_to_url = urllib.parse.quote(string_to_append_to_url)

                            string_params = {
                                'verb': 'HEAD',
                                'Content-Encoding': '',
                                'Content-Language': '',
                                'Content-Length': '',
                                'Content-MD5': '',
                                'Content-Type': '',
                                'Date': '',
                                'If-Modified-Since': '',
                                'If-Match': '',
                                'If-None-Match': '',
                                'If-Unmodified-Since': '',
                                'Range': '',
                                'CanonicalizedHeaders': 'x-ms-date:' + request_time + '\nx-ms-version:' + api_version + '\n',
                                'CanonicalizedResource': '/' + storage_account_name + '/' + string_to_append_to_url
                            }

                            string_to_sign = (string_params['verb'] + '\n'
                                              + string_params['Content-Encoding'] + '\n'
                                              + string_params['Content-Language'] + '\n'
                                              + string_params['Content-Length'] + '\n'
                                              + string_params['Content-MD5'] + '\n'
                                              + string_params['Content-Type'] + '\n'
                                              + string_params['Date'] + '\n'
                                              + string_params['If-Modified-Since'] + '\n'
                                              + string_params['If-Match'] + '\n'
                                              + string_params['If-None-Match'] + '\n'
                                              + string_params['If-Unmodified-Since'] + '\n'
                                              + string_params['Range'] + '\n'
                                              + string_params['CanonicalizedHeaders']
                                              + string_params['CanonicalizedResource'])

                            signed_string = base64.b64encode(hmac.new(base64.b64decode(storage_account_key), msg=string_to_sign.encode('utf-8'), digestmod=hashlib.sha256).digest()).decode()

                            headers = {
                                'x-ms-date': request_time,
                                'x-ms-version': api_version,
                                'Authorization': ('SharedKey ' + storage_account_name + ':' + signed_string)
                            }

                            url = ('https://' + storage_account_name + '.file.core.windows.net/' + string_to_append_to_url)
                            print(url)

                            r = requests.get(url, headers=headers)
                            print(r.content)

NOTE: Some of the directories have white space in their names, so I am not sure whether this is affecting the REST API call, because the URL will then contain spaces too. If it does affect it, how would I go about accessing files whose URLs contain spaces?

"I try to access the last modified property using the SDK but it always returns None for some reason."

Not every SDK method or REST API operation returns the Last-Modified property in the response headers. Those that do not include the REST API List Directories and Files and the Python SDK method list_directories_and_files.

I tried to reproduce your issue using the SDK, with the code below.

generator = file_service.list_directories_and_files(share_name, directory_name)
for file_or_dir in generator:
    if isinstance(file_or_dir, File):
        print(file_or_dir.name, file_or_dir.properties.last_modified)

Because the list_directories_and_files method does not populate any properties on the File objects it returns, file_or_dir.properties.last_modified in the code above is None.

The REST APIs Get File, Get File Properties, and Get File Metadata, and the Python SDK methods get_file_properties and get_file_metadata, do return the Last-Modified property in the response headers. So change the code as below to fetch last_modified and make it work.

generator = file_service.list_directories_and_files(share_name, directory_name)
for file_or_dir in generator:
    if isinstance(file_or_dir, File):
        file_name = file_or_dir.name
        file = file_service.get_file_properties(share_name, directory_name, file_name, timeout=None, snapshot=None)
        print(file_or_dir.name, file.properties.last_modified)
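
Building on the loop above, the 24-hour cleanup from the question could be sketched roughly as follows. This is only an illustration, not a tested recipe against a real share: the helper names is_older_than and delete_stale_files and the is_file callback are hypothetical, and it assumes file_service is the legacy FileService object, whose get_file_properties and delete_file methods take the share, directory, and file name.

```python
import datetime


def is_older_than(last_modified, max_age, now=None):
    """Return True if last_modified lies more than max_age before now."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # Treat a naive timestamp as UTC so the subtraction below is valid.
    if last_modified.tzinfo is None:
        last_modified = last_modified.replace(tzinfo=datetime.timezone.utc)
    return now - last_modified > max_age


def delete_stale_files(file_service, share_name, directory_name, is_file,
                       max_age=datetime.timedelta(hours=24), now=None):
    """Delete files in one directory that are older than max_age.

    is_file is a caller-supplied predicate (hypothetical helper) that tells
    files apart from subdirectories in the listing.
    Returns the names of the files that were deleted.
    """
    deleted = []
    for entry in file_service.list_directories_and_files(share_name, directory_name):
        if not is_file(entry):
            continue
        # The listing carries no timestamps, so fetch the properties per file.
        props = file_service.get_file_properties(share_name, directory_name, entry.name)
        if is_older_than(props.properties.last_modified, max_age, now=now):
            file_service.delete_file(share_name, directory_name, entry.name)
            deleted.append(entry.name)
    return deleted
```

With the SDK from the question, the predicate would presumably be something like lambda e: isinstance(e, File), and the function would be called once per bridge directory path.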

Of course, calling the REST API gives the same result as using the SDK. However, building the signature string by hand is error-prone and makes the code harder to read.
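
If the REST route is still preferred, two things in the question's code look like plausible causes of the authentication error: the string to sign uses the verb HEAD while the request is sent with requests.get, and paths containing spaces are not percent-encoded. The sketch below is a minimal illustration under those assumptions, not a verified fix. The helper name build_shared_key_headers is hypothetical, it only covers requests with no body and no query parameters, and it assumes the canonicalized resource should use the same percent-encoded path as the URL.

```python
import base64
import hashlib
import hmac
import urllib.parse


def build_shared_key_headers(account_name, account_key, verb, resource_path,
                             request_time, api_version="2018-03-28"):
    """Return (url, headers) for a body-less File service request.

    resource_path is the unencoded "share/dir/file" path;
    request_time is an RFC 1123 date string such as the question builds.
    """
    # Percent-encode the path (spaces become %20) but keep "/" separators.
    encoded_path = urllib.parse.quote(resource_path, safe="/")

    canonicalized_headers = (
        "x-ms-date:" + request_time + "\n"
        + "x-ms-version:" + api_version + "\n"
    )
    # Assumption: the signed resource uses the encoded path, like the URL.
    canonicalized_resource = "/" + account_name + "/" + encoded_path

    # Verb, then the eleven empty standard headers from the question
    # (Content-Encoding through Range), each terminated by a newline.
    string_to_sign = (
        verb + "\n" + "\n" * 11
        + canonicalized_headers + canonicalized_resource
    )

    digest = hmac.new(base64.b64decode(account_key),
                      msg=string_to_sign.encode("utf-8"),
                      digestmod=hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode()

    headers = {
        "x-ms-date": request_time,
        "x-ms-version": api_version,
        "Authorization": "SharedKey " + account_name + ":" + signature,
    }
    url = "https://" + account_name + ".file.core.windows.net/" + encoded_path
    return url, headers
```

The request then has to be sent with the same verb that was signed, for example requests.head(url, headers=headers) when verb is "HEAD", and the timestamp would be read from the Last-Modified response header rather than from the (empty) body.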
