
Why does gcloud ml-engine submit command give "requested CPUs exceed quota"?

I am running a tensorflow object detection job on GCP with the following command:

gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
    --job-dir=gs://${YOUR_GCS_BUCKET}/train \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_tpu_main \
    --runtime-version 1.9 \
    --scale-tier BASIC_TPU \
    --region us-central1 \
    -- \
    --model_dir=gs://${YOUR_GCS_BUCKET}/train \
    --tpu_zone us-central1 \
    --pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/pinches_pipeline.config

Got the following error:

ERROR: (gcloud.ml-engine.jobs.submit.training) RESOURCE_EXHAUSTED: Quota failure for project seal-pinches. The requested 54.0 CPUs exceeds the allowed maximum of 20.0. To read more about Cloud ML Engine quota, see https://cloud.google.com/ml-engine/quotas
- '@type': type.googleapis.com/google.rpc.QuotaFailure
  violations:
  - description: The requested 54.0 CPUs exceeds the allowed maximum of 20.0.

My question is: how is the requested CPU count getting set to 54? I am not setting this anywhere explicitly.

Thanks in advance.

This option in your command sets the size and type of your ML instance:

--scale-tier BASIC_TPU

The BASIC_TPU tier costs $6.8474 per hour. I am not sure of the exact formula, but a Cloud TPU translates into N CPUs for equivalent billing and quota purposes. You also need to add the cost of the Cloud ML Engine machine type: standard is $0.2774 per hour.

Google's description:

Quota is defined in terms of Cloud TPU cores. A single Cloud TPU device comprises 4 TPU chips and 8 cores: 2 cores per TPU chip. A Cloud TPU v2 Pod (alpha) consists of 64 TPU devices containing 256 TPU chips (512 cores). The number of cores also specifies the quota for a particular Cloud TPU. For example, a quota of 8 enables the use of 8 cores. A quota of 16 enables use of up to 16 cores, and so forth.

Your CPU quota is 20. You will need to increase your quota or choose a different scale tier, such as BASIC or BASIC_GPU, which does not use TPUs. Also double check that your billing setup uses a credit/debit card with sufficient credit available.
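For example, a resubmission on the non-TPU BASIC_GPU tier might look like the sketch below. Note this is an illustration, not a tested command: it assumes `object_detection.model_main` is the appropriate non-TPU entry point for this version of the Object Detection API (the TPU-specific `model_tpu_main` and the `--tpu_zone` flag are dropped), and that the other paths match your original job.

```shell
# Sketch: same job resubmitted on BASIC_GPU, which does not request a
# Cloud TPU and so should stay within a default 20-CPU quota.
# object_detection.model_main is assumed as the non-TPU entry point.
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
    --job-dir=gs://${YOUR_GCS_BUCKET}/train \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --runtime-version 1.9 \
    --scale-tier BASIC_GPU \
    --region us-central1 \
    -- \
    --model_dir=gs://${YOUR_GCS_BUCKET}/train \
    --pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/pinches_pipeline.config
```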
