
Google Cloud Dataflow Python --maxNumWorkers

I am trying to increase the number of workers for a Dataflow pipeline built with Apache Beam's Python SDK. I found documentation suggesting that setting the --maxNumWorkers= flag would be sufficient to raise the maximum number of workers beyond the default of 15. However, when I add this flag to the pipeline options, it does not seem to work. Looking back at the execution parameters documented here, I noticed that maxNumWorkers is not listed under the Python "Specifying Other Cloud Pipeline Options" section, although it does appear in the Java SDK -- is this a known limitation of the Python package? Are there any other options I am overlooking for setting maxNumWorkers within the Python pipeline options?

Note: I have confirmed that this is not a quota issue, since I can specify --num_workers=100. However, I believe that will not use the autoscaling algorithm, as it immediately sets the number of workers to 100.

It looks like in Python the option is --max_num_workers.
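
For reference, here is a minimal sketch of passing that flag through PipelineOptions; the runner, project, region, and bucket values below are hypothetical placeholders, not taken from the original question:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Placeholder project/region/bucket values; substitute your own.
    options = PipelineOptions([
        '--runner=DataflowRunner',
        '--project=my-project',
        '--region=us-central1',
        '--temp_location=gs://my-bucket/temp',
        # Autoscaling ceiling; note the snake_case name, unlike Java's --maxNumWorkers.
        '--max_num_workers=50',
    ])

    with beam.Pipeline(options=options) as p:
        (p
         | 'Create' >> beam.Create(range(10))
         | 'Print' >> beam.Map(print))

With --max_num_workers set and --num_workers left unset, Dataflow's autoscaling still chooses the worker count itself, but it can scale up to the given ceiling instead of stopping at the default of 15.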
