Google Cloud Dataflow Python --maxNumWorkers

I am trying to increase the number of workers for a Dataflow pipeline built with the Apache Beam Python SDK. I found documentation suggesting that setting the --maxNumWorkers= flag would be enough to raise the maximum number of workers beyond the default of 15. However, when I add this flag to the pipeline options, it does not seem to work. Looking back at the execution parameter options documented here, I noticed that maxNumWorkers is not listed under the Python "Specifying Other Cloud Pipeline Options" section, while it does appear for the Java SDK -- is this a known limitation of the Python package? Is there another option I am missing for setting the maximum number of workers in the Python pipeline options?

Note: I have confirmed that this is not a quota issue, as I can specify --num_workers=100, but this (I believe) will not use the autoscaling algorithm since it immediately sets the number of workers to 100.

It looks like the option in Python is --max_num_workers.
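
For reference, a minimal sketch of passing the option (the project, bucket, and region values below are placeholders, not from the original post). From the command line:

    python my_pipeline.py \
        --runner DataflowRunner \
        --project my-gcp-project \
        --temp_location gs://my-bucket/tmp \
        --max_num_workers 100

Or programmatically via PipelineOptions:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # All resource names below are placeholders.
    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-gcp-project",            # placeholder project ID
        temp_location="gs://my-bucket/tmp",  # placeholder GCS path
        region="us-central1",                # placeholder region
        max_num_workers=100,                 # upper bound for autoscaling
        autoscaling_algorithm="THROUGHPUT_BASED",
    )

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(range(10))
            | "Square" >> beam.Map(lambda x: x * x)
        )

Unlike --num_workers, which sets the worker count up front, --max_num_workers only caps the autoscaler, so the service still scales between one worker and the given limit.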
