When using Google Cloud Dataflow Flex Templates, is it possible to use a multi-command CLI to run a job?
I've read through the Google Cloud Dataflow (Python SDK v2.40) documentation on creating a Flex Template for a single job. All the examples and documentation I have found map a single Python file to a single pipeline.
However, I'd like to use a single Python file that encapsulates multiple pipelines, to allow a more modular and documentable set of pipelines.
I'd like to minimize the number of separate images I create to run multiple pipelines.
One approach that I would normally use is to have a multi-command command line interface like:
pipeline_script.py pipeline1
for pipeline 1.
pipeline_script.py pipeline2
for pipeline 2, and so on.
I see a single environment variable, FLEX_TEMPLATE_PYTHON_PY_OPTIONS, that might be useful, but the documentation is not clear on how it could help in my use case.
In summary, I have multiple Dataflow pipelines that I'd like to run from a single Flex Template image. Any pointers?
I don't believe there's a hard restriction that ties a given image to a single Flex Template. However, each image maps to a given main file, as mentioned here.
So, if you are able to update your main file to run different pipelines based on the metadata provided, you might be able to use the same image for multiple pipelines.
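As a rough illustration of that idea, here is a minimal sketch of a main file that picks a pipeline from a command-line option. It assumes the template's parameters reach the main file as `--name=value` arguments; the `--pipeline` option and the `run_pipeline1` / `run_pipeline2` functions are hypothetical names, not part of any Dataflow API, and the functions are placeholders where the real Beam pipelines would be built. The option would presumably also need to be declared as a parameter in the template's metadata.

```python
import argparse
import sys


def run_pipeline1(beam_args):
    # Placeholder (hypothetical): a real job would build and run a Beam
    # pipeline here, passing beam_args on as its pipeline options.
    return ("pipeline1", list(beam_args))


def run_pipeline2(beam_args):
    # Placeholder (hypothetical) for a second, independent pipeline.
    return ("pipeline2", list(beam_args))


# Registry mapping the value of the hypothetical --pipeline option
# to the function that builds and runs that pipeline.
PIPELINES = {
    "pipeline1": run_pipeline1,
    "pipeline2": run_pipeline2,
}


def main(argv=None):
    # allow_abbrev=False keeps argparse from treating an unrelated
    # option as an abbreviation of --pipeline.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--pipeline", required=True, choices=sorted(PIPELINES))
    # Everything argparse does not recognise is left untouched and handed
    # to the selected pipeline (runner, project, job-specific options...).
    known, beam_args = parser.parse_known_args(argv)
    return PIPELINES[known.pipeline](beam_args)


if __name__ == "__main__":
    main(sys.argv[1:])
```

With this layout, the same image could serve several jobs by changing only the value of the selector option at launch time, rather than using positional subcommands like `pipeline_script.py pipeline1`.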