
One pod/job per Kubernetes node

I would like to periodically schedule Kubernetes jobs (with different images).

These jobs must run on a node with GPU support (1 GPU device).

Currently, if I create two jobs at the same time, both pods are scheduled on the same node, but only one of them gets access to the GPU device.

Is there a way to configure nodes/pods so that the scheduler places only one pod per node and, once that pod has completed, places the next job?

You could set an inter-pod anti-affinity, as described in the Kubernetes docs.

Inter-pod affinity and anti-affinity rules take the form "this Pod should (or, in the case of anti-affinity, should not) run in an X if that X is already running one or more Pods that meet rule Y", where X is a topology domain like node, rack, cloud provider zone or region, or similar and Y is the rule Kubernetes tries to satisfy.

As with node affinity, there are two types of Pod affinity and anti-affinity:

  • requiredDuringSchedulingIgnoredDuringExecution
  • preferredDuringSchedulingIgnoredDuringExecution

Consider the following Pod spec:

    apiVersion: v1
    kind: Pod
    metadata:
      name: with-pod-affinity
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: security
                operator: In
                values:
                - S1
            topologyKey: topology.kubernetes.io/zone
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: security
                  operator: In
                  values:
                  - S2
              topologyKey: topology.kubernetes.io/zone
      containers:
      - name: with-pod-affinity
        image: registry.k8s.io/pause:2.0
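The spec quoted from the docs works at zone level. For the question here (at most one of these pods per node), a required anti-affinity rule keyed on the node hostname may be closer to what is needed. The following is only a sketch: the Job name, the `workload: gpu-job` label, and the image are hypothetical placeholders, and every GPU job would have to carry the same label for the rule to apply between them.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-job-a                # hypothetical name
spec:
  template:
    metadata:
      labels:
        workload: gpu-job        # hypothetical label shared by all GPU jobs
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: do not schedule onto a node that already runs
          # a pod carrying the same label.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: workload
                operator: In
                values:
                - gpu-job
            topologyKey: kubernetes.io/hostname   # one topology domain per node
      containers:
      - name: main
        image: registry.example.com/gpu-task:latest   # placeholder image
      restartPolicy: Never
```

With a rule like this, a second pod with the same label should stay Pending until a node without a matching running pod is available.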

Since you prefer a simple approach, you can use nodeSelector to pick the desired node by its node labels, and use queueSort to schedule pods one after another. In short, you define pods with certain labels to run on a particular node, ordered by priority. This document gives you a better understanding of how to achieve the functionality you want.
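As a rough illustration of the nodeSelector part only, the sketch below assumes the GPU node has been given a hypothetical label, for example with `kubectl label nodes <node-name> gpu=true`. The Job name and image are placeholders as well. Ordering of pending pods is a scheduler-side setting and is not shown here.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-job-a                # hypothetical name
spec:
  template:
    spec:
      # Only nodes carrying this label are eligible for the pod.
      nodeSelector:
        gpu: "true"              # hypothetical node label; values must be strings
      containers:
      - name: main
        image: registry.example.com/gpu-task:latest   # placeholder image
      restartPolicy: Never
```

Note that nodeSelector on its own restricts where pods may run but does not limit how many of them land on the same node.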

I ended up using the suggestion from @Calum Halpin, together with Extended Resources for a Node:

https://kubernetes.io/docs/tasks/administer-cluster/extended-resource-node/

  1. Patch the node to add 1 "gpu" to its capacity (the request below goes through kubectl proxy, hence localhost:8001):
curl --header "Content-Type: application/json-patch+json" \
--request PATCH \
--data '[{"op": "add", "path": "/status/capacity/gpu", "value": "1"}]' \
http://localhost:8001/api/v1/nodes/<node-name>/status
  2. Then set a resource request for 1 gpu on your container:
    resources:
      requests:
        gpu: 1
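Put together in a Job manifest, the two steps might look like the sketch below. One assumption differs from the snippets above: the Kubernetes documentation uses domain-qualified names for extended resources, so this example uses the hypothetical name `example.com/gpu` instead of a bare `gpu`. The patch path in step 1 would then become `/status/capacity/example.com~1gpu`, where `~1` is the JSON Pointer escape for `/`. The Job name and image are placeholders.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-job-a                # hypothetical name
spec:
  template:
    spec:
      containers:
      - name: main
        image: registry.example.com/gpu-task:latest   # placeholder image
        resources:
          # Extended resources cannot be overcommitted, so request and
          # limit are set to the same value.
          requests:
            example.com/gpu: 1   # hypothetical extended resource name
          limits:
            example.com/gpu: 1
      restartPolicy: Never
```

Because the node advertises a capacity of 1, the scheduler can fit only one such pod on it at a time. A second job's pod should remain Pending until the first pod finishes and releases the resource.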
