
Composer (Airflow) DAG RunID conflict in GCP

We have a Cloud Function with a Cloud Storage trigger. It fires as soon as a file is loaded into the bucket and calls (triggers) an Airflow DAG, which then processes the file.

The issue is that when multiple files are placed in the bucket within the same second, the function call fails with the error below:

b'{"error":"Run id manual__2020-07-31T17:48:15+00:00 already exists for dag id pl_imaoc_trigger_dag"}\n'

To resolve this, we pass an explicit run_id built from the date and time down to microseconds (the %f directive), for example 'run_id': 'IMAOC_31072020201842766625'.

Code:

dag_name = environ_vars['imaoc_meta_dag']
webserver_url = (
    webserver_id
    + '/api/experimental/dags/'
    + dag_name
    + '/dag_runs'
)

print('webserver_url: {}'.format(webserver_url))
data['run_id'] = _datetime.datetime.now().strftime("IMAOC_%d%m%Y%H%M%S%f")
resp = map_iap_request(webserver_url, client_id, method='POST', json=data)
print('response text:{}'.format(resp))

But this did not resolve it: AIRFLOW_CTX_DAG_RUN_ID still arrives in the "manual__2020-07-31T20:18:43+00:00" format.

I have no idea how to remove this conflict and trigger the DAG when files arrive within the same second.
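The error message suggests that the default run ID is "manual__" followed by the trigger time truncated to whole seconds. Below is a minimal sketch of that collision, assuming this format. default_run_id is a hypothetical helper written for illustration only; it is not Airflow's own code.

```python
from datetime import datetime, timezone


def default_run_id(ts: datetime, replace_microseconds: bool = True) -> str:
    """Illustrative only: build a 'manual__<timestamp>' style run ID."""
    if replace_microseconds:
        # Drop the fractional second, so every trigger within the
        # same second maps to the same string.
        ts = ts.replace(microsecond=0)
    return "manual__" + ts.isoformat()


# Two uploads that land within the same second (hypothetical timestamps).
first = datetime(2020, 7, 31, 17, 48, 15, 120000, tzinfo=timezone.utc)
second = datetime(2020, 7, 31, 17, 48, 15, 870000, tzinfo=timezone.utc)

print(default_run_id(first))   # manual__2020-07-31T17:48:15+00:00
print(default_run_id(second))  # manual__2020-07-31T17:48:15+00:00 (same ID)

# Keeping the microseconds makes the two IDs distinct.
print(default_run_id(first, replace_microseconds=False))
print(default_run_id(second, replace_microseconds=False))
```

Under this assumption, any two triggers inside one second produce the same ID, which matches the "already exists" error above.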

Please use the code below; it works.

import os
from datetime import datetime

client_id = os.getenv("CLIENT_ID")
# This should be part of your webserver's URL:
# {tenant-project-id}.appspot.com
webserver_id = os.getenv("TENANT_PROJECT")
# The name of the DAG you wish to trigger
dag_name = os.getenv("DAG_NAME")
webserver_url = (
    'https://'
    + webserver_id
    + '.appspot.com/api/experimental/dags/'
    + dag_name
    + '/dag_runs'
)
# Unique run ID with microsecond resolution
run_id = datetime.utcnow().strftime('alpaca_%Y-%m-%dT%H:%M:%S.%f')

# `data` and `make_iap_request` are defined elsewhere in the function.
# "replace_microseconds": False keeps the microseconds of the execution
# date instead of truncating it to whole seconds.
body = {"conf": data, "run_id": run_id, "replace_microseconds": False}
print(f"JSON body = {body}")

# Make a POST request to IAP, which then triggers the DAG
make_iap_request(webserver_url, client_id, method='POST', json=body)
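A timestamp-based run ID can still repeat if two function instances generate it in the same microsecond. One hedged way to rule that out is to append a random suffix. The sketch below assumes the same request body shape as the code above; make_run_id and build_trigger_body are hypothetical helper names, and the "trig" prefix is arbitrary.

```python
import uuid
from datetime import datetime, timezone


def make_run_id(prefix: str = "trig") -> str:
    """Return a run ID made of a UTC timestamp plus a short random suffix."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    # The UUID fragment keeps IDs distinct even for identical timestamps.
    return f"{prefix}__{stamp}_{uuid.uuid4().hex[:8]}"


def build_trigger_body(conf: dict, prefix: str = "trig") -> dict:
    """Assemble the JSON body for the dag_runs POST request."""
    return {
        "conf": conf,
        "run_id": make_run_id(prefix),
        # Same top-level flag as in the answer above.
        "replace_microseconds": False,
    }


if __name__ == "__main__":
    # Hypothetical conf payload describing the uploaded file.
    print(build_trigger_body({"bucket": "my-bucket", "name": "file.csv"}))
```

The returned dictionary can be passed as the json argument of make_iap_request in place of the hand-built body.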

The above answer worked for me after adding "replace_microseconds": False to the conf, as shown below:

run_id = 'trig__'+datetime.datetime.utcnow().isoformat()
conf['replace_microseconds'] = False
response = requests.post(URL, headers=Header, json={"conf": conf, "run_id": run_id})
