Databricks: Uploading a file to another location from Azure Blob Storage without copying it locally
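I have a file in Azure Blob Storage that I want to upload to another location without first copying it to Databricks' local storage.
Currently my code has to copy the file locally before uploading it: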
# Set up connection to Azure Blob Storage
spark.conf.set("fs.azure.account.key.[some location]", "[account key]")
# Copies the file to Databricks local storage
dbutils.fs.cp("wasbs://[folder location]/some_file.csv", "temp_some_file.csv")
# Set up the upload to the other system
uploader = client.create_dataset_from_upload('data', 'csv')  # This is an external library call
# Read the local copy and upload it to the other system
with open('/dbfs/temp_some_file.csv') as dataset:
    uploader.upload_file(dataset)
How can I change the open() call so that it points directly at the file in Azure Blob Storage?
You can mount the container in DBFS:
storage = ...
container = ...
sas = '...'
dbutils.fs.mount(
    source = f"wasbs://{container}@{storage}.blob.core.windows.net",
    mount_point = "/mnt/uploader",
    extra_configs = {f"fs.azure.sas.{container}.{storage}.blob.core.windows.net": sas}
)
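If the cell that mounts the container may run more than once, it can help to check the existing mounts first, since mounting onto a mount point that is already in use fails. A minimal sketch, assuming the standard `dbutils.fs.mounts()` listing; the helper name `mount_if_needed` is hypothetical and not part of `dbutils`:

```python
def mount_if_needed(dbutils, source, mount_point, extra_configs):
    """Mount `source` at `mount_point` unless something is already mounted there.

    `dbutils` is passed in explicitly (an assumption made so the helper can
    live outside a notebook cell); in a Databricks notebook it is the
    built-in `dbutils` object.
    Returns True if a new mount was created, False if one already existed.
    """
    # dbutils.fs.mounts() lists current mounts; each entry has a `mountPoint`
    already_mounted = any(m.mountPoint == mount_point for m in dbutils.fs.mounts())
    if already_mounted:
        return False
    dbutils.fs.mount(
        source=source,
        mount_point=mount_point,
        extra_configs=extra_configs,
    )
    return True
```

In a notebook this would be called with the same `source` and `extra_configs` values as in the `dbutils.fs.mount` call above.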
The container is then accessible at dbfs:/mnt/uploader. And because DBFS itself is mounted at /dbfs on the driver and executor nodes, you can open the file directly:
with open('/dbfs/mnt/uploader/some_file.csv', 'r') as dataset:
    uploader.upload_file(dataset)
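The relationship between the two path forms (the `dbfs:/...` URI used by `dbutils` and the `/dbfs/...` path used by `open()`) is purely mechanical. As an illustration only, a small hypothetical helper (`dbfs_to_local_path` is not part of any Databricks API) could do the translation:

```python
def dbfs_to_local_path(uri: str) -> str:
    """Translate a dbfs:/ URI into the matching path under the /dbfs mount."""
    prefix = "dbfs:/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a dbfs:/ URI: {uri!r}")
    # Drop the scheme and any extra leading slashes, then re-root under /dbfs
    return "/dbfs/" + uri[len(prefix):].lstrip("/")


print(dbfs_to_local_path("dbfs:/mnt/uploader/some_file.csv"))
# prints /dbfs/mnt/uploader/some_file.csv
```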
Don't forget to unmount it afterwards (unless you want the mount to be permanent).
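One way to make sure the unmount always happens, even if the upload raises, is to wrap the mount in a context manager. This is only a sketch: `temporary_mount` is a hypothetical helper name, and `dbutils` is passed in as a parameter by assumption; the calls it makes are the standard `dbutils.fs.mount` and `dbutils.fs.unmount`.

```python
from contextlib import contextmanager


@contextmanager
def temporary_mount(dbutils, source, mount_point, extra_configs):
    """Mount `source` for the duration of a `with` block, then unmount it."""
    dbutils.fs.mount(
        source=source,
        mount_point=mount_point,
        extra_configs=extra_configs,
    )
    try:
        # DBFS is exposed on the local filesystem under /dbfs,
        # so this is the directory that open() can read from.
        yield "/dbfs" + mount_point
    finally:
        # Runs on normal exit and on exceptions alike
        dbutils.fs.unmount(mount_point)


# Usage in a notebook (uploader, container, storage and sas as defined above):
# with temporary_mount(dbutils, source, "/mnt/uploader", extra_configs) as local_dir:
#     with open(f"{local_dir}/some_file.csv", "r") as dataset:
#         uploader.upload_file(dataset)
```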