
Kubernetes scheduling GPU-pods in a load-balanced manner

There is a Kubernetes cluster with n nodes, some of which are fitted with multiple NVIDIA 1080Ti GPU cards.

I have two kinds of pods:
1. GPU-enabled: these need to be scheduled on GPU-fitted nodes, where each pod will use only one of the GPU cards present on that node.
2. CPU-only: these can be scheduled anywhere, preferably on CPU-only nodes.

The scheduling problem is addressed clearly in this answer.

Issue: When scheduling a GPU-enabled pod on a GPU-fitted node, I want to be able to decide which of the node's multiple GPU cards my pod is going to use. Further, I was thinking of a load balancer sitting transparently between the GPU hardware and the pods that would decide the mapping.

Any help with this architecture would be deeply appreciated. Thank you!

You have to use the official NVIDIA GPU device plugin rather than the one suggested by GCE. With it, it is possible to schedule GPUs by attributes:

Pods can specify device selectors based on the attributes that are advertised on the node. These can be specified at the container level. For example:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:9.0-base
      command: ["sleep"]
      args: ["100000"]
      computeResourceRequests: ["nvidia-gpu"]
  computeResources:
    - name: "nvidia-gpu"
      resources:
        limits:
          nvidia.com/gpu: 1
      affinity:
        required:
          - key: "nvidia.com/gpu-memory"
            operator: "Gt"
            values: ["8000"] # change value to appropriate mem for GPU
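
Note that the `computeResourceRequests` and `computeResources` fields above are not part of the upstream Kubernetes Pod spec; they appear to be specific to NVIDIA's Kubernetes distribution described in the installation guide. For comparison, here is a minimal sketch of the same idea on upstream Kubernetes with the NVIDIA device plugin installed. The `gpu-model` node label and the pod name are hypothetical, chosen only for illustration; the label would have to be applied to the GPU nodes by hand.

```yaml
# Sketch only. Assumes the NVIDIA device plugin is running on the GPU nodes
# and that those nodes were labelled manually, for example:
#   kubectl label nodes <node-name> gpu-model=1080ti
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-upstream        # hypothetical name
spec:
  nodeSelector:
    gpu-model: "1080ti"         # hypothetical label, not set by Kubernetes itself
  containers:
    - name: cuda-container
      image: nvidia/cuda:9.0-base
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          nvidia.com/gpu: 1     # request exactly one whole GPU on the node
```

In this form the pod only states how many GPUs it needs. As far as I know, which physical card it gets is decided on the node by the kubelet and the device plugin, not by the pod spec, so this constrains the node and the GPU count rather than picking a specific card.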

Check the Kubernetes on NVIDIA GPUs Installation Guide.

Hope this helps.


 