Azure Data Lake Storage Gen2 permissions

I am currently building a data lake (Gen2) in Azure. I use Terraform to provision all the resources. However, I ran into some permission inconsistencies. According to the documentation, one can set permissions for the data lake with RBAC and ACLs.

I chose ACLs because they allow fine-grained permissions on directories within the data lake. In the data lake, I created a directory "raw", among other directories, on which a certain group has r-- (read-only) default permissions. The default means that objects created under this directory are assigned the same permissions as the directory itself. When users in that group try to access the data lake with Storage Explorer, they see neither the storage account nor the filesystem/container in which the directory lives. So they are not able to access the directory for which they have read-only permissions.

So I was thinking of assigning the permissions needed to at least list storage accounts and filesystems (containers). After evaluating the existing roles, I arrived at the following permissions:

  1. Microsoft.Storage/storageAccounts/listKeys/action
  2. Microsoft.Storage/storageAccounts/read

After applying permission 1, nothing changed. After applying permission 2 as well, users in the group could suddenly do everything in the data lake, as if no ACL had been specified.

My question now is: how can I use ACLs (and RBAC) to create a data lake whose directories have different permissions for different groups, so that each group can only read or write the directories named in its ACLs? In addition, the groups should be able to list the storage accounts and filesystems (containers) that contain directories they have access to.

I believe you also need to create access ACLs on the entire hierarchy of folders down to the file or folder you are trying to read, including the root container.

So if your folder "raw" was created at the top level, you'll need to create the following ACLs for that group...

"/"    --x (access)
"/raw" r-x (access)
"/raw" r-x (default)

... and the default ACL will then give the group read and execute permissions on all sub folders and files created under it.

You also need to give the group at least the Reader RBAC role on the resource. This can be on the storage account, or just on the container if you want to restrict access to other containers.
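The Reader assignment could be expressed in Terraform roughly as follows. This is a sketch rather than part of the original answer: the resource names group_reader_account and group_reader_container are made up for illustration, the referenced storage account, filesystem and variable are the ones from the example further down, and the container scope is built by hand from the usual ARM ID layout for blob containers, so check it against your own environment. You would normally keep only one of the two options.

```hcl
# Option A: Reader on the whole storage account.
resource "azurerm_role_assignment" "group_reader_account" {
  scope                = azurerm_storage_account.data_lake.id
  role_definition_name = "Reader"
  principal_id         = var.aad_group_object_id
}

# Option B: Reader on the "acltest" container only.
# Assumption: the container's ARM ID has the form
#   <storage account id>/blobServices/default/containers/<container name>
resource "azurerm_role_assignment" "group_reader_container" {
  scope                = "${azurerm_storage_account.data_lake.id}/blobServices/default/containers/${azurerm_storage_data_lake_gen2_filesystem.data_lake_acl_test.name}"
  role_definition_name = "Reader"
  principal_id         = var.aad_group_object_id
}
```

Reader is a management-plane role: it lets the group see the resource but carries no data permissions and no listKeys action, so the ACLs still decide which folders the group can actually read.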

You can set the ACLs on the container with the ace block of the azurerm_storage_data_lake_gen2_filesystem Terraform resource, and then set the ACLs on the folders using the azurerm_storage_data_lake_gen2_path Terraform resource.

Here's an example where I'm storing the object_id of the Azure Active Directory group in a variable named aad_group_object_id.

# create the data lake
resource "azurerm_storage_account" "data_lake" {
  # ... (arguments omitted in the original; a Gen2 data lake needs is_hns_enabled = true)
}

# create a container named "acltest" with execute ACL for the group
resource "azurerm_storage_data_lake_gen2_filesystem" "data_lake_acl_test" {
  name               = "acltest"
  storage_account_id = azurerm_storage_account.data_lake.id
  
  ace {
    type        = "group"
    scope       = "access"
    id          = var.aad_group_object_id
    permissions = "--x"
  }
}

# create the folder "raw" and give read and execute access and default permissions to group
resource "azurerm_storage_data_lake_gen2_path" "folder_raw" {
  path               = "raw"
  filesystem_name    = azurerm_storage_data_lake_gen2_filesystem.data_lake_acl_test.name
  storage_account_id = azurerm_storage_account.data_lake.id
  resource           = "directory"

  ace {
    type        = "group"
    scope       = "access"
    id          = var.aad_group_object_id
    permissions = "r-x"
  }
  ace {
    type        = "group"
    scope       = "default"
    id          = var.aad_group_object_id
    permissions = "r-x"
  }
}
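
The example reads the group's object ID from var.aad_group_object_id, which is not declared in the snippet. A minimal declaration might look like the following; the description text is only illustrative wording.

```hcl
# Input variable holding the object ID of the Azure AD group
# that should get read access to the "raw" folder.
variable "aad_group_object_id" {
  type        = string
  description = "Object ID of the Azure Active Directory group that is granted read access."
}
```

The value can then be supplied at apply time, for example with terraform apply -var 'aad_group_object_id=<object id>' or through a .tfvars file.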

I've not included them in the folder_raw example, but you'll also have to add the ACL entries for the owning user, owning group, mask and other, which Azure adds to the root container and sub folders by itself. Otherwise your Terraform plan will keep trying to drop and recreate them on every run.

You can just add this. Unfortunately, you need to add it to every folder you create, unless someone knows a way around it.

  ace {
    permissions = "---" 
    scope       = "access"
    type        = "other"
  }
  ace {
    permissions = "r-x"
    scope       = "access"
    type        = "group"
  }
  ace {
    permissions = "r-x"
    scope       = "access"
    type        = "mask"
  }
  ace {
    permissions = "rwx"
    scope       = "access"
    type        = "user"
  }
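
One possible way to avoid copying these four blocks into every folder is to keep them in a local value and generate the ace blocks with a dynamic block. The sketch below is an untested suggestion, not something from the original answer: the names base_aces, folder_reader_groups and folders are hypothetical, and the resource is meant as an alternative to the single folder_raw resource above, not an addition to it. The base entries simply mirror the four shown above, so adjust them to whatever your own plan reports.

```hcl
locals {
  # Entries that Azure adds by itself; declaring them keeps the plan stable.
  base_aces = [
    { scope = "access", type = "user", id = null, permissions = "rwx" },
    { scope = "access", type = "group", id = null, permissions = "r-x" },
    { scope = "access", type = "mask", id = null, permissions = "r-x" },
    { scope = "access", type = "other", id = null, permissions = "---" },
  ]

  # Hypothetical mapping: folder name => object ID of the group that may read it.
  folder_reader_groups = {
    raw = var.aad_group_object_id
  }
}

resource "azurerm_storage_data_lake_gen2_path" "folders" {
  for_each           = local.folder_reader_groups
  path               = each.key
  filesystem_name    = azurerm_storage_data_lake_gen2_filesystem.data_lake_acl_test.name
  storage_account_id = azurerm_storage_account.data_lake.id
  resource           = "directory"

  # One ace block per entry: the shared base entries plus the
  # access and default entries for this folder's reader group.
  dynamic "ace" {
    for_each = concat(local.base_aces, [
      { scope = "access", type = "group", id = each.value, permissions = "r-x" },
      { scope = "default", type = "group", id = each.value, permissions = "r-x" },
    ])
    content {
      scope       = ace.value.scope
      type        = ace.value.type
      id          = ace.value.id
      permissions = ace.value.permissions
    }
  }
}
```

Adding another folder with a different group then only means adding one more key to folder_reader_groups.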
