
Loading CSV data to BigQuery using Python in Terraform

Read a CSV file and load it into BigQuery through a Dataflow job, using Python code for this instead of templates. How can I perform this task using Terraform (GCP)? Can anyone help?

I am trying to do this but don't understand what Terraform script I should write for it.

It's not the responsibility of Terraform to deploy a Dataflow job.

There is only a Terraform resource to instantiate a Dataflow template.
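The question rules templates out, but for completeness, here is a minimal sketch of that resource, google_dataflow_job, launching the Google-provided GCS_Text_to_BigQuery template. The project, bucket, and table names are placeholders, and the parameter list follows that template's documentation, so check it against your template version:

    resource "google_dataflow_job" "csv_to_bq" {
      name              = "csv-to-bq"
      # Google-provided template that loads text files from GCS into BigQuery.
      template_gcs_path = "gs://dataflow-templates/latest/GCS_Text_to_BigQuery"
      temp_gcs_location = "gs://my-bucket/tmp"

      parameters = {
        inputFilePattern                    = "gs://my-bucket/input/*.csv"
        JSONPath                            = "gs://my-bucket/schema.json"
        outputTable                         = "my-project:my_dataset.my_table"
        javascriptTextTransformGcsPath      = "gs://my-bucket/transform.js"
        javascriptTextTransformFunctionName = "transform"
        bigQueryLoadingTemporaryDirectory   = "gs://my-bucket/tmp"
      }
    }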

For custom code, you can delegate this to your CI/CD instead.

Example with Beam Python:

  • Develop the job with Beam Python
  • Through your CI/CD, deploy the Python Beam code to a Cloud Storage bucket
  • Run the Dataflow job and main file with the Python command line (see the sketch after this list)
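A minimal sketch of such a job, assuming a two-column CSV and placeholder project, bucket, and table names (the file name main.py, the schema, and the parsing are illustrative, not prescribed):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def parse_line(line):
        # Map one CSV line onto the BigQuery schema declared below.
        name, age = line.split(",")
        return {"name": name, "age": int(age)}


    def run():
        options = PipelineOptions(
            runner="DataflowRunner",
            project="my-project",                # placeholder project id
            region="us-central1",
            temp_location="gs://my-bucket/tmp",  # placeholder bucket
        )
        with beam.Pipeline(options=options) as p:
            (
                p
                | "Read CSV" >> beam.io.ReadFromText(
                    "gs://my-bucket/input.csv", skip_header_lines=1)
                | "Parse" >> beam.Map(parse_line)
                | "Write to BigQuery" >> beam.io.WriteToBigQuery(
                    "my-project:my_dataset.my_table",
                    schema="name:STRING,age:INTEGER",
                    create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                )
            )


    if __name__ == "__main__":
        run()

Because the Dataflow options are set in code here, the CI/CD step that submits the job is just python main.py; you can equally pass the same options as command-line flags.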

Example with Beam Java and mvn compile:

  • Develop the job with Beam Java and Maven or Gradle
  • Through your CI/CD, run the mvn compile command to execute the Dataflow job (see the command sketch after this list)
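Sketched with a hypothetical main class and placeholder project/bucket names, that step is typically a single command:

    mvn compile exec:java \
      -Dexec.mainClass=com.example.CsvToBigQuery \
      -Dexec.args="--runner=DataflowRunner --project=my-project --region=us-central1 --tempLocation=gs://my-bucket/tmp"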

Example with Beam Java and a fat jar:

  • Develop the job with Beam Java and Maven or Gradle
  • Through your CI/CD, generate a fat jar
  • Deploy this fat jar to a Cloud Storage bucket
  • Run the Dataflow job and the Main inside the fat jar with the java -jar command (see the sketch after this list)
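With placeholder jar, project, and bucket names, the last two steps look like:

    # Fetch the fat jar from the bucket, then submit the job to Dataflow.
    gsutil cp gs://my-bucket/pipelines/csv-to-bq-bundled.jar .
    java -jar csv-to-bq-bundled.jar \
      --runner=DataflowRunner \
      --project=my-project \
      --region=us-central1 \
      --tempLocation=gs://my-bucket/tmp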

Example with Beam Python and Airflow / Cloud Composer:

  • Develop the job with Beam Python
  • Through your CI/CD, deploy the Python Beam code to the Cloud Composer bucket with gcloud composer
  • In the Airflow code, use BeamRunPythonPipelineOperator to instantiate the Dataflow job (see the DAG sketch after this list)
  • Run the Airflow DAG to run the Dataflow job
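A minimal DAG sketch, assuming the Beam main file sits in the Composer bucket and using placeholder project, bucket, and region values:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.beam.operators.beam import BeamRunPythonPipelineOperator
    from airflow.providers.google.cloud.operators.dataflow import DataflowConfiguration

    with DAG(
        dag_id="csv_to_bigquery",
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,  # trigger manually
        catchup=False,
    ) as dag:
        BeamRunPythonPipelineOperator(
            task_id="run_beam_job",
            runner="DataflowRunner",
            # Beam main file deployed to the Composer bucket (placeholder path).
            py_file="gs://my-composer-bucket/dags/beam/main.py",
            pipeline_options={"tempLocation": "gs://my-bucket/tmp"},
            py_requirements=["apache-beam[gcp]"],
            py_system_site_packages=False,
            dataflow_config=DataflowConfiguration(
                project_id="my-project", location="us-central1"
            ),
        )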

Example with Beam Java and Airflow / Cloud Composer:

  • Develop the job with Beam Java
  • Through your CI/CD, generate a fat jar
  • Deploy this fat jar to a Cloud Storage bucket
  • In the Airflow code, use BeamRunJavaPipelineOperator to instantiate the Dataflow job, targeting the path of the fat jar (see the DAG sketch after this list)
  • Run the Airflow DAG to run the Dataflow job
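The Java variant of the DAG is almost identical; the operator points at the fat jar in Cloud Storage and at its Main class (all names are placeholders):

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.beam.operators.beam import BeamRunJavaPipelineOperator
    from airflow.providers.google.cloud.operators.dataflow import DataflowConfiguration

    with DAG(
        dag_id="csv_to_bigquery_java",
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,
        catchup=False,
    ) as dag:
        BeamRunJavaPipelineOperator(
            task_id="run_beam_job",
            runner="DataflowRunner",
            # Fat jar previously deployed to Cloud Storage (placeholder path).
            jar="gs://my-bucket/pipelines/csv-to-bq-bundled.jar",
            job_class="com.example.CsvToBigQuery",  # hypothetical Main class
            pipeline_options={"tempLocation": "gs://my-bucket/tmp"},
            dataflow_config=DataflowConfiguration(
                project_id="my-project", location="us-central1"
            ),
        )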
