
Not able to delete directory from Azure Storage container by Databricks notebook

I'm trying to delete empty directories from an Azure Storage container that is mounted to my DBFS.

I'm able to list all directories that contain no files.

%sh
find /dbfs/mnt/test/logs/2021 -empty -type d

Result:

/dbfs/mnt/test/logs/2021/02/12
/dbfs/mnt/test/logs/2021/02/15
/dbfs/mnt/test/logs/2021/02/16

But when I try to delete them, it fails with "Resource temporarily unavailable".

%sh
find /dbfs/mnt/test/logs/ -type d -exec rmdir {} \; 

Result:

rmdir: failed to remove '/dbfs/mnt/test/logs/': Directory not empty
rmdir: failed to remove '/dbfs/mnt/test/logs/2021': Directory not empty
rmdir: failed to remove '/dbfs/mnt/test/logs/2021/02': Directory not empty
rmdir: failed to remove '/dbfs/mnt/test/logs/2021/02/12': Resource temporarily unavailable

I'm able to successfully remove files older than a certain number of days, but removing directories does not work. (The command below, which removes files, works.)

%sh
find /dbfs/mnt/test/logs/ -name "*.log" -type f -mtime +5 -exec rm -f {} \; 

First thing to remember: DBFS is an abstraction over cloud blob storage, where there are no real directories. They are just prefixes used to organize data. If you run %sh ls -ls /dbfs/mnt/test/logs/ you may notice that all directories have the same timestamp, and it could be a recent one; I don't remember offhand how it's calculated. Only files have their own timestamps.

So if you need to reliably remove directories, it's better to use dbutils.fs.rm('/mnt/test/logs/', True) (in Python; Scala is similar) to remove a directory recursively (see the docs). There are limitations, though: for example, wildcards are not supported, so you need to generate a list of directories to delete and then delete them one by one.
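The "generate a list, then delete" approach could be sketched roughly as follows. This is only an illustration, not something from the Databricks docs: remove_empty_dirs and _prune are hypothetical helper names, and the sketch assumes that the object passed as fs is dbutils.fs, whose ls() entries expose a path attribute and an isDir() method. The starting directory itself is never deleted.

```python
def _prune(fs, path, removed):
    """Return True if `path` holds no files once its empty subdirectories are gone."""
    is_empty = True
    for entry in fs.ls(path):
        if entry.isDir():
            # Visit children first so that nested empty directories
            # are removed before their parent is evaluated.
            if _prune(fs, entry.path, removed):
                fs.rm(entry.path, True)
                removed.append(entry.path)
            else:
                is_empty = False
        else:
            # Any file makes this directory (and its ancestors) non-empty.
            is_empty = False
    return is_empty


def remove_empty_dirs(fs, root):
    """Delete every empty directory below `root`; return the deleted paths.

    `fs` is assumed to behave like dbutils.fs (ls and rm methods).
    `root` itself is kept even if it ends up empty.
    """
    removed = []
    _prune(fs, root, removed)
    return removed
```

In a notebook this would presumably be called as remove_empty_dirs(dbutils.fs, '/mnt/test/logs/2021'), using the mount path without the /dbfs prefix, as in the dbutils.fs.rm call above. Printing the returned list first on a test directory is a reasonable precaution before running it against real data.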


