
gcp trigger dataflow job from composer error

I am trying to run a Dataflow job from a Composer Airflow DAG using the code below.

Depending on the code, I get two kinds of error messages.

Please suggest how to fix this.

a) Error 1: when serviceAccountEmail is commented out (#)

# "serviceAccountEmail": "service-7276363xxxxx@cloudcomposer-accounts.iam.gserviceaccount.com",

Error:

Error: Required 'compute.subnetworks.get' permission for 'projects/vpc-host/regions/us-central1/subnetworks/sbn-dataflow'

b) Error 2: when serviceAccountEmail is used

"serviceAccountEmail": "service-7276363xxxxx@cloudcomposer-accounts.iam.gserviceaccount.com",

Error:

Current user cannot act as service account service-7276363xxxxx@cloudcomposer-accounts.iam.gserviceaccount.com

Code:

import datetime

from airflow import models
from airflow.contrib.operators.dataflow_operator import DataflowTemplateOperator
from airflow.utils.dates import days_ago

bucket_path = models.Variable.get("bucket_path")
project_id = models.Variable.get("project_id")
gce_zone = models.Variable.get("gce_zone")


default_args = {
    "owner": "Airflow",
    "start_date": days_ago(1),
    "depends_on_past": False,
    "dataflow_default_options": {
        "project": project_id,
        "zone": gce_zone,
        "serviceAccountEmail": "service-7276363xxxxx@cloudcomposer-accounts.iam.gserviceaccount.com",
        "subnetwork": "https://www.googleapis.com/compute/v1/projects/vpc-host/regions/us-central1/subnetworks/sbn-dataflow",
        "tempLocation": bucket_path + "/tmp/",
    }
}


with models.DAG(
    dag_id="composer_dataflow_dag",
    default_args=default_args,
    schedule_interval=datetime.timedelta(days=1)
) as dag:
    dataflow_template_job = DataflowTemplateOperator(
        task_id="dataflow_csv_to_bq",
        template="gs://dataflow-templates/latest/GCS_Text_to_BigQuery",
        parameters={
            "javascriptTextTransformFunctionName": "transformCSVtoJSON",
            "javascriptTextTransformGcsPath": bucket_path + "/SCORE_STG.js",
            "JSONPath": bucket_path + "/SCORE_STG.json",
            "inputFilePattern": bucket_path + "/stg_data.csv",
            "outputTable": project_id + ":gcp_stage.SCORE_STG",
            "bigQueryLoadingTemporaryDirectory": bucket_path + "/tmp/",
        },
        dag=dag,
    )

You have to use a different service account, and remember that it must have access to the resources. That should fix both problems.

You can create a service account to act as the worker, with the roles described in the role assignments, i.e. one worker and one admin.

Beyond that, I don't see anything wrong; even the parameters are passed correctly.
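The two errors map to two missing IAM grants: error (b) means the Composer environment's service account lacks permission to impersonate the Dataflow worker account, and error (a) means the worker account cannot use the Shared VPC subnetwork in the host project. A rough sketch of the grants with `gcloud` (the service-account names and project IDs below are placeholders, not values from the question; `vpc-host`, `us-central1`, and `sbn-dataflow` are taken from the error message):

```shell
# Hypothetical commands; substitute your own project and service-account names.

# Fix for error (b): let the Composer worker service account "act as"
# the Dataflow worker service account (roles/iam.serviceAccountUser).
gcloud iam service-accounts add-iam-policy-binding \
  dataflow-worker@MY_PROJECT.iam.gserviceaccount.com \
  --member="serviceAccount:composer-worker@MY_PROJECT.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountUser"

# Fix for error (a): grant the worker service account use of the Shared VPC
# subnetwork in the host project (roles/compute.networkUser on the subnet).
gcloud compute networks subnets add-iam-policy-binding sbn-dataflow \
  --project=vpc-host \
  --region=us-central1 \
  --member="serviceAccount:dataflow-worker@MY_PROJECT.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser"
```

The worker service account also needs roles/dataflow.worker on the project running the job for the Dataflow workers to start.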

