
Reading data from Azure Blob Storage into Azure Databricks using /mnt/

I've successfully mounted my blob storage to Databricks, and can see the defined mount point when running dbutils.fs.ls("/mnt/"). This has size=0 - it's not clear if this is expected or not.

When I try and run dbutils.fs.ls("/mnt/<mount-name>"), I get this error: java.io.FileNotFoundException: / is not found

When I try and write a simple file to my mounted blob with dbutils.fs.put("/mnt/<mount-name>/1.txt", "Hello, World!", True), I get the following error (shortened for readability):

ExecutionError: An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.put. : shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.util.NoSuchElementException: An error occurred while enumerating the result, check the original exception for details.
...
Caused by: com.microsoft.azure.storage.StorageException: The specified resource does not exist.

All the data is in the root of the Blob container, so I have not defined any folder structures in the dbutils.fs.mount code.


The solution here is making sure you are using the 'correct' part of your Shared Access Signature (SAS). When the SAS is generated, you'll find there are lots of different parts of it that you can use - it's likely sent to you as one long connection string, e.g.:

BlobEndpoint=https://<storage-account>.blob.core.windows.net/;QueueEndpoint=https://<storage-account>.queue.core.windows.net/;FileEndpoint=https://<storage-account>.file.core.windows.net/;TableEndpoint=https://<storage-account>.table.core.windows.net/;SharedAccessSignature=sv=<date>&ss=nwrt&srt=sco&sp=rsdgrtp&se=<datetime>&st=<datetime>&spr=https&sig=<long-string>
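
If it's easier, you can pull that part out programmatically instead of copying it by hand. A minimal sketch in Python, assuming the connection string has the key=value;key=value shape shown above (conn_str and sas_token are illustrative names, and the queue/file/table endpoints are omitted for brevity):

# Split the connection string on ';' into key=value pairs.
# split("=", 1) matters here: the SAS value itself contains '=' characters.
conn_str = ("BlobEndpoint=https://<storage-account>.blob.core.windows.net/;"
            "SharedAccessSignature=sv=<date>&ss=nwrt&srt=sco&sp=rsdgrtp&se=<datetime>&st=<datetime>&spr=https&sig=<long-string>")

parts = dict(p.split("=", 1) for p in conn_str.split(";") if p)
sas_token = parts["SharedAccessSignature"]  # "sv=<date>&ss=nwrt&...&sig=<long-string>"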

When you define your mount point, use the value of the SharedAccessSignature key, e.g.:

sv=<date>&ss=nwrt&srt=sco&sp=rsdgrtp&se=<datetime>&st=<datetime>&spr=https&sig=<long-string>
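
For reference, this is roughly where that value goes. A sketch using the standard dbutils.fs.mount API for Blob Storage, with the same placeholders as above plus <container-name> for your container; in practice you'd read the token from a secret scope rather than pasting it into a notebook:

dbutils.fs.mount(
  source = "wasbs://<container-name>@<storage-account>.blob.core.windows.net",
  mount_point = "/mnt/<mount-name>",
  extra_configs = {
    # The config key names the container and account the SAS applies to;
    # the value is the SharedAccessSignature part only, not the full connection string.
    "fs.azure.sas.<container-name>.<storage-account>.blob.core.windows.net":
      "sv=<date>&ss=nwrt&srt=sco&sp=rsdgrtp&se=<datetime>&st=<datetime>&spr=https&sig=<long-string>"
  }
)

After remounting like this, dbutils.fs.ls("/mnt/<mount-name>") should list the container contents rather than throwing the FileNotFoundException above.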
