
Dynamic Airflow EMR Connection

I have an Airflow DAG which creates an EMR cluster and then runs SSHOperator tasks on that cluster. Right now I am hard-coding the master public DNS of the EMR cluster into an Airflow SSH connection. Is there a way for my DAG to populate this DNS dynamically when the EMR cluster is created, so I don't have to update the connection manually?

After digging a bit further into the Airflow CLI, I found it is possible to create and delete connections. I added a BashOperator after the task that builds the EMR cluster to recreate the Airflow connection:

airflow connections --delete --conn_id aws_emr

airflow connections --add --conn_id aws_emr --conn_type SSH --conn_host publicDNS --conn_login username --conn_extra '{"key_file":"file.pem"}'
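The commands above can be wired into the DAG itself. Below is a minimal sketch in plain Python (runnable without an Airflow install) of building the command string such a BashOperator could run; the Jinja template pulls the DNS that the EMR-creation task is assumed to have pushed to XCom. The task id `create_emr_cluster`, the XCom key `master_dns`, the login, and the key file path are all placeholders, not names from the original post.

```python
# Jinja template resolved by Airflow at runtime; assumes the upstream
# EMR task pushed the master DNS under key "master_dns" (hypothetical names).
dns_template = "{{ ti.xcom_pull(task_ids='create_emr_cluster', key='master_dns') }}"

# Delete any stale connection, then re-add it pointing at the new cluster.
# Note the single quotes around --conn_extra so the shell passes the JSON intact.
bash_command = (
    "airflow connections --delete --conn_id aws_emr && "
    "airflow connections --add --conn_id aws_emr --conn_type SSH "
    f"--conn_host '{dns_template}' "
    "--conn_login username "
    "--conn_extra '{\"key_file\":\"file.pem\"}'"
)
print(bash_command)
```

In the DAG, this string would be passed as `bash_command=...` to a BashOperator placed downstream of the cluster-creation task and upstream of the SSHOperator tasks, so the connection is refreshed before any SSH task uses it.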

You can use Airflow XCom variables to pass a value from one task to another. In your use case, you can pass the EMR DNS value from the EMR-creation task to the SSH task via an XCom variable.

Airflow XCom concepts

Pushing data to XCom:

context['ti'].xcom_push(key="xcom_key", value="DNS_NAME")

Pulling data from XCom:

context['ti'].xcom_pull(key="xcom_key", task_ids="EMR_Task")
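The push/pull flow above can be illustrated end to end with a runnable toy that needs no live Airflow install. The class below is only a tiny in-memory stand-in for the XCom store; in a real DAG, `context['ti']` inside a PythonOperator callable provides `xcom_push`/`xcom_pull` with the signatures shown above. Task ids and the example DNS value are made up for illustration.

```python
class FakeTaskInstance:
    """Tiny in-memory stand-in for Airflow's TaskInstance XCom API."""
    _store = {}  # shared store keyed by (task_id, key), like the XCom table

    def __init__(self, task_id):
        self.task_id = task_id

    def xcom_push(self, key, value):
        self._store[(self.task_id, key)] = value

    def xcom_pull(self, key, task_ids):
        return self._store[(task_ids, key)]

# The EMR-creation task pushes the master DNS it discovered:
emr_ti = FakeTaskInstance("EMR_Task")
emr_ti.xcom_push(key="xcom_key", value="ec2-1-2-3-4.compute.amazonaws.com")

# The downstream SSH task pulls it back by key and upstream task id:
ssh_ti = FakeTaskInstance("SSH_Task")
dns = ssh_ti.xcom_pull(key="xcom_key", task_ids="EMR_Task")
print(dns)  # ec2-1-2-3-4.compute.amazonaws.com
```

The same pull is also available in templated operator fields as `{{ ti.xcom_pull(key='xcom_key', task_ids='EMR_Task') }}`, which is how the DNS can reach an SSHOperator or BashOperator without a hard-coded connection host.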

Note: the technical posts on this site are licensed under CC BY-SA 4.0; if you repost, please credit this site or the original source.
