Azure Databricks: can't connect to Azure Data Lake Storage Gen2

I have a storage account kagsa1 with a container cont1 inside, and I need it to be accessible (mounted) from Databricks.

If I use the storage account key stored in Key Vault, it works correctly:

configs = {
    "fs.azure.account.key.kagsa1.blob.core.windows.net":dbutils.secrets.get(scope = "kv-db1", key = "storage-account-access-key")
}

dbutils.fs.mount(
  source = "wasbs://cont1@kagsa1.blob.core.windows.net",
  mount_point = "/mnt/cont1",
  extra_configs = configs)

dbutils.fs.ls("/mnt/cont1")

..but if I try to connect using Azure Active Directory credentials:

configs = {
"fs.azure.account.auth.type": "CustomAccessToken",
"fs.azure.account.custom.token.provider.class": spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
}

dbutils.fs.ls("abfss://cont1@kagsa1.dfs.core.windows.net/")

..it fails:

ExecutionError: An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.ls.
: GET https://kagsa1.dfs.core.windows.net/cont1?resource=filesystem&maxResults=5000&timeout=90&recursive=false
StatusCode=403
StatusDescription=This request is not authorized to perform this operation using this permission.
ErrorCode=AuthorizationPermissionMismatch
ErrorMessage=This request is not authorized to perform this operation using this permission.

The Databricks workspace tier is Premium,
the cluster has the Azure Data Lake Storage Credential Passthrough option enabled,
the storage account has the hierarchical namespace option enabled,
and the filesystem was initialized with

spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
dbutils.fs.ls("abfss://cont1@kagsa1.dfs.core.windows.net/")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "false")

and I have full access to the container in the storage account.

What am I doing wrong?

Note: When performing the steps in "Assign the application to a role", make sure to assign the Storage Blob Data Contributor role to the service principal.
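The role assignment from the note can also be scripted. Below is a minimal sketch, assuming the Azure CLI is installed and logged in. The subscription ID, resource group, and service principal application ID are hypothetical placeholders, not values from the question. The code only builds the container-level scope and the `az role assignment create` argument list; actually running the command is left to the caller.

```python
import shlex


def container_scope(subscription_id, resource_group, account, container):
    """Build the ARM resource ID of a blob container, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def role_assignment_command(assignee, scope, role="Storage Blob Data Contributor"):
    """Return the Azure CLI argument list that grants `role` on `scope`."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ]


# Placeholders in angle brackets are assumptions; replace them with real values.
scope = container_scope("<subscription-id>", "<resource-group>", "kagsa1", "cont1")
cmd = role_assignment_command("<service-principal-app-id>", scope)

# Print a copy-pasteable shell command instead of executing it.
print(shlex.join(cmd))
```

Scoping the assignment to the container rather than the whole storage account keeps the grant narrow; a storage-account-level scope would also work if broader access is intended. Role assignments can take a few minutes to propagate.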

As part of the repro, I granted the Owner role to the service principal and ran dbutils.fs.ls("mnt/azure/"); it returned the same error message as above.

I then assigned the Storage Blob Data Contributor role to the service principal.

Finally, after assigning the Storage Blob Data Contributor role to the service principal, I was able to get the output without any error message.
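For reference, mounting the container with a service principal usually looks like the sketch below. It assumes the OAuth client-credentials settings of the ABFS driver. The application ID, client secret, and tenant ID are hypothetical placeholders; in a notebook the secret would normally come from `dbutils.secrets.get`. The `dbutils.fs.mount` call is left commented out because `dbutils` only exists inside a Databricks notebook.

```python
def service_principal_configs(client_id, client_secret, tenant_id):
    """Build the extra_configs dict for OAuth (client credentials) access to ADLS Gen2."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# Placeholder values (assumptions); substitute the real service principal details.
configs = service_principal_configs("<application-id>", "<client-secret>", "<tenant-id>")

# Inside a Databricks notebook, the mount would then be:
# dbutils.fs.mount(
#     source="abfss://cont1@kagsa1.dfs.core.windows.net/",
#     mount_point="/mnt/azure",
#     extra_configs=configs,
# )
# dbutils.fs.ls("/mnt/azure")
```

Without the Storage Blob Data Contributor role (or another data-plane role) on the service principal, this mount succeeds but the listing still fails with the 403 shown in the question.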

For more details, refer to "Tutorial: Azure Data Lake Storage Gen2, Azure Databricks & Spark".

Reference: Azure Databricks - ADLS Gen2 throws 403 error message.
