
When using Google Cloud Dataflow flex templates, is it possible to use a multi-command CLI to run a job?

I've read through the Google Cloud Dataflow (Python SDK v2.40) documentation on creating a Flex Template for a single job. All the examples and documentation I have found map a single Python file to a single pipeline. However, I'd like to have a single Python file that encapsulates multiple pipelines, to allow a more modular and better-documented set of pipelines, and to minimize the number of separate images I have to build to run them. One approach I would normally use is a multi-command command-line interface like:

pipeline_script.py pipeline1

for pipeline 1, and

pipeline_script.py pipeline2

for pipeline 2, and so on.
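For reference, the pattern I have in mind looks roughly like the sketch below. The `run_pipeline1`/`run_pipeline2` functions are hypothetical stubs standing in for the actual Beam pipeline code; unrecognized flags are passed through to the selected pipeline so they could become Beam pipeline options:

```python
import argparse
import sys

def run_pipeline1(beam_args):
    # Stub: a real version would build and run an Apache Beam pipeline
    # here, forwarding beam_args as its pipeline options.
    return "pipeline1"

def run_pipeline2(beam_args):
    return "pipeline2"

# Registry mapping subcommand names to pipeline entry points.
PIPELINES = {
    "pipeline1": run_pipeline1,
    "pipeline2": run_pipeline2,
}

def main(argv):
    parser = argparse.ArgumentParser(description="Run one of several pipelines")
    parser.add_argument("pipeline", choices=sorted(PIPELINES),
                        help="name of the pipeline to run")
    args, beam_args = parser.parse_known_args(argv)
    # Any arguments argparse does not recognize are handed to the pipeline.
    return PIPELINES[args.pipeline](beam_args)

if __name__ == "__main__":
    main(sys.argv[1:])
```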

I see a single environment variable, FLEX_TEMPLATE_PYTHON_PY_OPTIONS, that might be useful, but the documentation is not clear on how it could help with my use case.

In summary, I have multiple Dataflow pipelines that I'd like to run from a single Flex Template image. Any pointers?

I don't believe there's a hard restriction limiting a given image to a single flex template. But each image maps to a single main file, as mentioned here.

So, if you are able to update your main file to run different pipelines based on the metadata provided, you should be able to use the same image for multiple pipelines.
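As a rough sketch of that idea: Flex Templates pass template parameters to the main file as command-line flags, so the main file could read a hypothetical `--pipeline` parameter (which you would declare in the template's metadata file) and dispatch to the matching pipeline. The pipeline functions below are stubs standing in for real Beam pipelines:

```python
import argparse

def run_ingest(beam_args):
    # Stub: a real implementation would construct an Apache Beam pipeline
    # and run it, using beam_args as its pipeline options.
    return "ingest"

def run_transform(beam_args):
    return "transform"

# One entry per pipeline this single image can run.
REGISTRY = {"ingest": run_ingest, "transform": run_transform}

def main(argv=None):
    parser = argparse.ArgumentParser()
    # "--pipeline" is a hypothetical template parameter; Dataflow passes
    # template parameters to the main file as flags like this one.
    parser.add_argument("--pipeline", required=True, choices=sorted(REGISTRY))
    known, beam_args = parser.parse_known_args(argv)
    # All other flags fall through to the chosen pipeline.
    return REGISTRY[known.pipeline](beam_args)
```

With this dispatcher, launching the template with `pipeline=ingest` or `pipeline=transform` selects the pipeline at run time, so one image serves every pipeline in the registry.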

