
Using external tables without granting access to GCS

We have two GCP projects, project-a and project-b, and we want to give users in the second project access to some external tables in the first project using Authorized Views.

Here's what we've done so far:

  • create a couple of BigQuery tables in project-a (private dataset) as external tables over GCS Parquet files
  • create a dataset (public dataset) in project-b in which we created authorized views on the external tables from project-a

However, if we give users in project-b access to query the public views, they receive this error:

Access Denied: BigQuery BigQuery: Permission denied while globbing file pattern.

I know this means they should also have read permission on the GCS buckets of project-a, but we can't grant this permission in GCS.

Is there a way to achieve this? Or maybe another way of doing it?

AFAIK, you must have permission to access the external data location (GCS, Google Sheets, or wherever the external data is located) in order to access the data. There is no trick around that.

This is now possible using BigLake tables. We simply need to create a connection resource in BigQuery, then use it to define an external table. Users then only need access to the BigQuery tables; there is no need to set permissions in the data location (GCS here).

  1. Create the connection using the Cloud Shell bq command
bq mk --connection --location=REGION --project_id=PROJECT_ID \
    --connection_type=CLOUD_RESOURCE CONNECTION_ID
  2. Show the connection and get the associated service account
bq show --connection PROJECT_ID.REGION.CONNECTION_ID
  3. Grant the resource connection service account access to the Cloud Storage bucket
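This grant step has no command in the answer; a minimal sketch using gcloud, where BUCKET_NAME is your bucket and CONNECTION_SA is the service account email returned by bq show in the previous step (both placeholders):

```shell
# Grant the connection's service account read access to the bucket
# (BUCKET_NAME and CONNECTION_SA are placeholders, not values from this answer).
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
    --member="serviceAccount:CONNECTION_SA" \
    --role="roles/storage.objectViewer"
```

roles/storage.objectViewer is enough for BigQuery to read the files; the end users themselves never need any GCS role.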
  4. Create a BigLake table using the resource connection:
CREATE EXTERNAL TABLE `PROJECT_ID.DATASET.EXTERNAL_TABLE_NAME`
WITH CONNECTION `PROJECT_ID.REGION.CONNECTION_ID`
OPTIONS (
    format ="TABLE_FORMAT",
    uris = ['FILE_PATH']
);
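For the Parquet files from the question, the template above might be filled in like this (the project, dataset, bucket, and connection names are illustrative, not from the question):

```sql
-- Illustrative names; replace with your own project, dataset,
-- connection, and bucket path.
CREATE EXTERNAL TABLE `project-a.private_dataset.my_biglake_table`
WITH CONNECTION `project-a.us.my_gcs_connection`
OPTIONS (
    format = "PARQUET",
    uris = ['gs://my-bucket/data/*.parquet']
);
```

The authorized views in project-b can then be defined over this table as before, and querying them no longer triggers the globbing permission error.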

