How to create directories in Azure storage container without creating extra files?

I've written Python code to create a range of folders and subfolders (for a data lake) in an Azure storage container. The code works and is based on the Microsoft Azure documentation. One thing, though: I'm creating a dummy 'txt' file in each folder in order to create the directory (which I can clean up later). I was wondering if there's a way to create the folders and subfolders without creating a file. I understand that folders in Azure container storage are not hierarchical and are instead metadata, so what I'm asking for may not be possible?

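from azure.storage.blob import BlobServiceClient, ContainerClient

# load_config() is the asker's own helper (not shown); it returns a dict of settings.
config = load_config()
connection_string = config['azure_storage_connectionstring']
gen2_container_name = config['gen2_container_name']
container_client = ContainerClient.from_connection_string(connection_string, gen2_container_name)
blob_service_client = BlobServiceClient.from_connection_string(connection_string)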

# blob_service_client.create_container(gen2_container_name)


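def create_folder(folder, sub_folder):
    # The "folder" only exists because this placeholder blob carries the path in its name.
    blob_client = container_client.get_blob_client('{}/{}/start_here.txt'.format(folder, sub_folder))

    # Upload a local dummy file as the placeholder blob.
    with open('test.txt', 'rb') as data:
        blob_client.upload_blob(data)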



def create_all_folders():
    config = load_config()
    folder_list = config['folder_list']
    sub_folder_list = config['sub_folder_list']
    for folder in folder_list:
        for sub_folder in sub_folder_list:
            try:
                create_folder(folder, sub_folder)
            except Exception as e:
                # Include the underlying error rather than only guessing at the cause.
                print('Could not create the folder structure {}/{} (maybe it already exists?): {}'.format(folder, sub_folder, e))

No, for blob storage, this is not possible. There is no way to create so-called "folders".

But you can use the Data Lake SDK to create a directory, like this:

from azure.storage.filedatalake import DataLakeServiceClient

connect_str = "DefaultEndpointsProtocol=https;AccountName=0730bowmanwindow;AccountKey=xxxxxx;EndpointSuffix=core.windows.net"
datalake_service_client = DataLakeServiceClient.from_connection_string(connect_str)
myfilesystem = "test"
myfolder = "test1111111111"
myfile = "FileName.txt"

# The file system corresponds to the container; the directory is created without any file in it.
file_system_client = datalake_service_client.get_file_system_client(myfilesystem)
directory_client = file_system_client.create_directory(myfolder)

Just to add some context: the reason this is not possible in Blob Storage is that folders/directories are not "real". Folders do not exist as standalone objects; they are only defined as part of a blob name.

For example, if you have a folder "mystuff" with a file (blob) "somefile.txt", the blob name actually includes the folder name and the "/" character, like mystuff/somefile.txt. The blob exists directly inside the container, not inside a folder. This naming convention can be nested many times over in a blob name, like folder1/folder2/mystuff/anotherfolder/somefile.txt, but that blob still only exists directly in the container.
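To make the naming convention concrete, here is a minimal pure-Python sketch (no Azure calls) that treats blob names as plain strings and derives the folders they imply. The helper names `split_blob_name` and `virtual_folders` are hypothetical and not part of any SDK.

```python
def split_blob_name(blob_name, delimiter="/"):
    """Split a flat blob name into its (virtual folder, file name) parts."""
    folder, _, file_name = blob_name.rpartition(delimiter)
    return folder, file_name


def virtual_folders(blob_names, delimiter="/"):
    """Return every folder path implied by the given blob names, sorted."""
    folders = set()
    for name in blob_names:
        parts = name.split(delimiter)[:-1]  # drop the file-name segment
        for depth in range(1, len(parts) + 1):
            folders.add(delimiter.join(parts[:depth]))
    return sorted(folders)


# Blob names as they would be stored directly in the container.
names = [
    "mystuff/somefile.txt",
    "folder1/folder2/mystuff/anotherfolder/somefile.txt",
]

print(split_blob_name(names[0]))
print(virtual_folders(names))
```

Removing a name from the list also removes the folders that only it implied, which is why an "empty folder" cannot exist in this model.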

Folders can appear to exist in certain tooling (like Azure Storage Explorer) because the SDK permits blob name filtering: if you filter on the "/" character, you can mimic the appearance of a folder and its contents. But in order for a folder to even appear to exist, there must be a blob in the container with the appropriate name. If you want to "force" a folder to exist, you can create a 0-byte blob with the correct folder path in its name, but the blob artifact will still need to exist.
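The 0-byte blob approach could look roughly like the sketch below. It assumes `container_client` is an `azure.storage.blob.ContainerClient` like the one in the question; the function name `create_folder_marker` and the `.keep` marker name are hypothetical choices for illustration.

```python
def create_folder_marker(container_client, folder_path, marker_name=".keep"):
    """Upload a 0-byte blob so that folder_path shows up in folder-style listings.

    container_client is assumed to be an azure.storage.blob.ContainerClient;
    marker_name is a hypothetical placeholder file name.
    """
    blob_name = "{}/{}".format(folder_path.strip("/"), marker_name)
    blob_client = container_client.get_blob_client(blob_name)
    # Empty bytes give a 0-byte blob; overwrite=True avoids an error on reruns.
    blob_client.upload_blob(b"", overwrite=True)
    return blob_name
```

This removes the need for a local dummy file, but the marker blob itself still has to stay in the container for the folder to keep appearing.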

The exception is Azure Data Lake Storage (ADLS) Gen 2, which is Blob Storage that implements a Hierarchical Namespace. This makes it more like a file system, and so it respects the concept of directories as standalone objects. ADLS is built on Blob Storage, so there is a lot of parity between the two. If you absolutely must have empty directories, then ADLS is the way to go.
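Applied to the folder/sub-folder loop from the question, an ADLS version might be sketched as follows. It assumes the storage account has the hierarchical namespace enabled and that `file_system_client` is an `azure.storage.filedatalake.FileSystemClient`; the function name `create_directories` is hypothetical.

```python
def create_directories(file_system_client, folder_list, sub_folder_list):
    """Create one empty directory per folder/sub_folder pair and return the paths.

    file_system_client is assumed to be an
    azure.storage.filedatalake.FileSystemClient for an account with the
    hierarchical namespace enabled.
    """
    created = []
    for folder in folder_list:
        for sub_folder in sub_folder_list:
            path = "{}/{}".format(folder, sub_folder)
            # No placeholder file is uploaded; the directory is its own object.
            file_system_client.create_directory(path)
            created.append(path)
    return created
```

A call such as `create_directories(file_system_client, config['folder_list'], config['sub_folder_list'])` would then replace `create_all_folders()` from the question, with no dummy file to clean up afterwards.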
