
GPU Workload with Composer 2 and GKE Autopilot?

We are running the latest version of Composer 2:
composer-2.0.28-airflow-2.3.3

Our GKE version is:
1.22.12-gke.2300

We want to deploy GPU workloads within Composer 2.

We tried the example Pod spec from the documentation:

apiVersion: v1
kind: Pod
metadata:
  name: my-gpu-pod
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  containers:
  - name: my-gpu-container
    image: nvidia/cuda:11.0.3-runtime-ubuntu20.04
    command: ["/bin/bash", "-c", "--"]
    args: ["while true; do sleep 600; done;"]
    resources:
      limits:
        nvidia.com/gpu: 1

but the example does not work for us.

The error message is:
Autopilot doesn't support GPUs yet.

The documentation says:
"Ensure that you have a GKE Autopilot cluster running GKE version 1.24.2-gke.1800 or later."
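To make the gap concrete, here is a minimal Python sketch that compares the two version strings quoted above. It assumes GKE versions always follow the MAJOR.MINOR.PATCH-gke.BUILD pattern seen in this post; the function names are hypothetical helpers, not part of any Google library.

```python
import re
from typing import Tuple

# Minimum version quoted from the Autopilot GPU documentation.
MIN_AUTOPILOT_GPU_VERSION = "1.24.2-gke.1800"


def parse_gke_version(version: str) -> Tuple[int, ...]:
    """Parse 'MAJOR.MINOR.PATCH-gke.BUILD' into a tuple of integers.

    Assumption: the string follows the pattern seen in this post.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized GKE version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_autopilot_gpus(cluster_version: str) -> bool:
    """Return True if cluster_version is at or above the documented minimum."""
    # Tuples compare element by element, so 1.22.12 sorts below 1.24.2.
    return parse_gke_version(cluster_version) >= parse_gke_version(
        MIN_AUTOPILOT_GPU_VERSION
    )


print(supports_autopilot_gpus("1.22.12-gke.2300"))  # False
```

Comparing as integer tuples rather than as strings matters here: a plain string comparison would compare "12" and "2" character by character and get the patch ordering wrong.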

Does this mean that GPU workloads cannot yet be used with the current version of Composer 2?

Or are we meant to use GKECreateClusterOperator and set up a separate, dedicated GPU node pool?

Thanks in advance for any help.

From the "Before You Begin" section of the Autopilot GPU docs:

Ensure that you have a GKE Autopilot cluster running GKE version 1.24.2-gke.1800 or later.

For me, that meant creating the cluster with the --release-channel=rapid flag. I ran into an issue when trying to upgrade an existing cluster in place and decided to discard it and create a new one, though there is probably a way to upgrade in place.
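As a rough sketch of that approach, the Python snippet below assembles a gcloud command that creates an Autopilot cluster on the rapid release channel. The cluster name and region are placeholder assumptions, and the helper function is hypothetical. It only builds and prints the command, so nothing is created until you run the printed line yourself. Note that this describes a standalone cluster, not the cluster that Composer manages.

```python
import shlex
from typing import List


def build_create_auto_command(
    cluster_name: str, region: str, release_channel: str = "rapid"
) -> List[str]:
    """Build the argv for creating an Autopilot cluster on a release channel."""
    return [
        "gcloud",
        "container",
        "clusters",
        "create-auto",
        cluster_name,  # placeholder name, choose your own
        f"--region={region}",  # placeholder region, choose your own
        f"--release-channel={release_channel}",
    ]


# Hypothetical values for illustration only.
command = build_create_auto_command("gpu-autopilot-cluster", "us-central1")
print(shlex.join(command))
```

Whether the rapid channel currently offers a version at or above 1.24.2-gke.1800 depends on when you run it, so check the cluster version after creation.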
