
Permissions Issue with Google Cloud Data Fusion

I'm following the instructions in the Cloud Data Fusion sample tutorial and everything seems to work fine, until I try to run the pipeline right at the end. The Cloud Data Fusion Service API permissions are set for the Google-managed service account as per the instructions, and the pipeline preview function works without any issues.

However, when I deploy and run the pipeline, it fails after a couple of minutes. Shortly after the status changes from provisioning to running, the pipeline stops with the following permissions error:

    com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
    {
      "code" : 403,
      "errors" : [ {
        "domain" : "global",
        "message" : "xxxxxxxxxxx-compute@developer.gserviceaccount.com does not have storage.buckets.create access to project X.",
        "reason" : "forbidden"
      } ],
      "message" : "xxxxxxxxxxx-compute@developer.gserviceaccount.com does not have storage.buckets.create access to project X."
    }

xxxxxxxxxxx-compute@developer.gserviceaccount.com is the default Compute Engine service account for my project.

"Project X" is not one of mine though, I've no idea why the pipeline startup code is trying to create a bucket there, it does successfully create temporary buckets ( one called df-xxx and one called dataproc-xxx) in my project before it fails. “Project X”不是我的一个,我不知道为什么管道启动代码试图在那里创建一个桶,它确实成功创建了临时桶(一个叫做df-xxx,一个叫做dataproc-xxx)。项目失败之前。

I've tried this with two separate accounts and get the same error in both cases. I had tried adding Storage Admin roles to the various service accounts, to no avail, but that was before I realized it was attempting to access a different project entirely.

I believe I was able to reproduce this. What's happening is that the BigQuery Source plugin first creates a temporary working GCS bucket to export the data to, and I suspect it is attempting to create that bucket in the Dataset Project ID by default, instead of in your own project as it should.
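
For context, the export step the plugin performs is roughly the following, sketched here with the google-cloud-bigquery Python client. The project IDs, table name, and bucket below are illustrative placeholders, not values from the tutorial; the point is that the extract job needs a GCS destination the job's service account can actually write to.

    from google.cloud import bigquery

    # Placeholder IDs for illustration only.
    MY_PROJECT = "my-project"                 # project that runs the pipeline
    DATASET_PROJECT = "some-dataset-project"  # project that owns the source dataset
    DEST_URI = "gs://my-temp-bucket/export-*.avro"

    client = bigquery.Client(project=MY_PROJECT)

    # Extract the source table to GCS before reading it. If the temporary
    # bucket is created in DATASET_PROJECT instead of MY_PROJECT, the job's
    # service account lacks storage.buckets.create there -- the 403 above.
    table = bigquery.TableReference.from_string(f"{DATASET_PROJECT}.my_dataset.my_table")
    extract_job = client.extract_table(table, DEST_URI)
    extract_job.result()  # block until the export finishes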

As a workaround, create a GCS bucket in your own project, and then, in the BigQuery Source configuration of your pipeline, set the "Temporary Bucket Name" configuration to "gs://<your-bucket-name>".
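
A minimal sketch of that first step with the google-cloud-storage client, assuming placeholder project and bucket names you would replace with your own:

    from google.cloud import storage

    # Placeholders -- substitute your project ID and a globally unique bucket name.
    MY_PROJECT = "my-project"
    BUCKET_NAME = "my-df-temp-bucket"

    client = storage.Client(project=MY_PROJECT)
    bucket = client.create_bucket(BUCKET_NAME)  # created in MY_PROJECT, not the dataset's project
    print(f"Set Temporary Bucket Name to gs://{bucket.name}")

With the bucket pinned to your own project, the plugin no longer needs to create one in the dataset's project, so the storage.buckets.create check never fires.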

You are missing the permissions setup steps that come after you create an instance. The instructions for giving your service account the right permissions are on this page: https://cloud.google.com/data-fusion/docs/how-to/create-instance
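
Those steps are normally done in the Cloud Console, but as a rough sketch of the equivalent IAM change in code (the role and member below are assumptions for illustration; use whatever the linked page specifies):

    from googleapiclient import discovery

    # Assumed placeholders -- the linked page names the exact role and
    # service account to grant; these values are illustrative only.
    PROJECT_ID = "my-project"
    MEMBER = "serviceAccount:xxxxxxxxxxx-compute@developer.gserviceaccount.com"
    ROLE = "roles/datafusion.serviceAgent"

    crm = discovery.build("cloudresourcemanager", "v1")

    # Read-modify-write the project's IAM policy to add the binding.
    policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()
    policy["bindings"].append({"role": ROLE, "members": [MEMBER]})
    crm.projects().setIamPolicy(resource=PROJECT_ID, body={"policy": policy}).execute()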
