
How to read a file from blob storage into Azure Databricks with the daily date in the file name

Posted 2022-08-22 05:13:42 | Tags: azure / pyspark / apache-spark-sql / databricks / azure-databricks

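I want to read the Employee_detail_info file into an Azure Databricks notebook from a blob storage container that also contains other files. The files are loaded from the source into blob storage every day.

Employee_detail_Info_20220705000037
Customersdetais_info_20220625000038
allinvocie_details_20220620155736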

1 Answer

You can use a glob pattern to meet this requirement. Below is a demonstration.

The following is the list of files in my storage account.

Customersdetais_info_20220625000038.csv
Employee_detail_Info_20220705000037.csv
Employee_detail_Info_20220822000037.csv
Employee_detail_Info_20220822000054.csv
allinvocie_details_20220620155736.csv

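(For this demo, all employee files have the same schema and contain one row each.)

Now build a pattern for your Employee_detail_Info files. I used the datetime library for this. Because each employee file name contains the load date in yyyyMMdd form, the pattern is the file name prefix followed by today's date in that form.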

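from datetime import datetime, timezone

# Today's date in UTC, formatted as yyyyMMdd (datetime.utcnow() is deprecated since Python 3.12)
todays_date = datetime.now(timezone.utc).strftime("%Y%m%d")
print(todays_date)  # e.g. 20220822 when run on 2022-08-22

# File name prefix shared by all of today's employee files
file_name_pattern = "Employee_detail_Info_" + todays_date
print(file_name_pattern)  # e.g. Employee_detail_Info_20220822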
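
As a small sketch of the same idea, the prefix can be produced by a helper that accepts any date, which makes it easier to re-read files from a past day. The function name `employee_file_prefix` and its parameters are hypothetical and not part of the original answer; the default falls back to the current UTC date.

```python
from datetime import date, datetime, timezone
from typing import Optional


def employee_file_prefix(day: Optional[date] = None,
                         base: str = "Employee_detail_Info_") -> str:
    """Return base + yyyyMMdd for the given day (defaults to today in UTC)."""
    if day is None:
        day = datetime.now(timezone.utc).date()
    return base + day.strftime("%Y%m%d")


# Fixed date used here so the result is reproducible.
print(employee_file_prefix(date(2022, 8, 22)))  # Employee_detail_Info_20220822
```

Passing an explicit date also avoids surprises when the notebook runs close to midnight and the source system stamps files in a different time zone.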

Now you can use the asterisk (*) glob pattern to read all files whose names start with our file_name_pattern.

df = spark.read.option("header", True).format("csv").load(f"/mnt/repro/{file_name_pattern}*.csv")
# Adjust the format and the file extension above to match your files.

df.show()
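
To see which names the wildcard selects without touching storage, the same pattern can be tried locally with Python's `fnmatch` module. This is only an approximation: Spark resolves the path with Hadoop's glob rules rather than `fnmatch`, although a plain `*` behaves the same way for these names. The file list is copied from the listing above, and the variable names are illustrative.

```python
from fnmatch import fnmatchcase

# File names copied from the storage account listing in the answer.
files = [
    "Customersdetais_info_20220625000038.csv",
    "Employee_detail_Info_20220705000037.csv",
    "Employee_detail_Info_20220822000037.csv",
    "Employee_detail_Info_20220822000054.csv",
    "allinvocie_details_20220620155736.csv",
]

# Prefix for 2022-08-22 followed by the wildcard and the extension.
pattern = "Employee_detail_Info_20220822*.csv"

# fnmatchcase does a case-sensitive match of the whole name against the pattern.
matched = [name for name in files if fnmatchcase(name, pattern)]
for name in matched:
    print(name)
```

Only the two files stamped 20220822 match, so the DataFrame read for that day would contain the rows of both.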

[Screenshots in the original answer, not reproduced here: the files in the storage account and the output of df.show().]
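
Two files in the listing carry the same date (20220822), so the glob loads both. If only the most recent load of a day is wanted, one option is to compare the digits at the end of each name. The sketch below assumes those 14 digits are a yyyyMMddHHmmss timestamp, which the sample names suggest but the question does not state. It works on a plain list of names; how the names are obtained from the mount is not shown, and the helper `latest_for_day` is hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumed layout: prefix, 14-digit yyyyMMddHHmmss stamp, .csv extension.
_NAME_RE = re.compile(r"^Employee_detail_Info_(\d{14})\.csv$")


def latest_for_day(names: Iterable[str], day: str) -> Optional[str]:
    """Return the name with the largest stamp for the given yyyyMMdd day, or None."""
    best_name, best_stamp = None, ""
    for name in names:
        m = _NAME_RE.match(name)
        # Stamps have a fixed width, so string comparison orders them by time.
        if m and m.group(1).startswith(day) and m.group(1) > best_stamp:
            best_name, best_stamp = name, m.group(1)
    return best_name


files = [
    "Customersdetais_info_20220625000038.csv",
    "Employee_detail_Info_20220705000037.csv",
    "Employee_detail_Info_20220822000037.csv",
    "Employee_detail_Info_20220822000054.csv",
    "allinvocie_details_20220620155736.csv",
]
print(latest_for_day(files, "20220822"))  # Employee_detail_Info_20220822000054.csv
```

The returned name can then be appended to the mount path and passed to the reader in place of the wildcard path.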

