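Kubernetes Basics (8) - Scheduler

I. Overview

The resources behind a k8s cluster form one large pool made up of many Nodes. The Nodes do not necessarily share the same hardware: some have more memory, some have more CPU, some have SSDs, and so on. Pods run in this pool, but each Pod has different resource requirements, so allocating the pool sensibly is an important problem. The scheduler solves exactly this: deciding which Node a Pod should run on.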

In Kubernetes, scheduling refers to making sure that Pods are matched to Nodes so that Kubelet can run them.

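1.1 Process overview

In the basic Pod creation flow (taken from a diagram found online), the scheduler selects a suitable Node for the Pod and reports its choice to the API Server. If no suitable Node is found, the Pod stays in the Pending state until suitable resources become available.

1.2 Scheduling goals

1.3 Scheduling steps

Finding a suitable Node for a Pod takes 2 steps: filtering (predicates, section 2.1) rules out the Nodes that fail the hard requirements, and scoring (priorities, section 2.2) ranks the remaining Nodes to pick the best one.

It is a bit like looking for a partner: a house and a car are the hard requirements, and on top of that they should be kind, generous and good-looking. To satisfy both kinds of condition, k8s implements a set of concrete scheduling policies.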
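The two steps can be sketched in a few lines of Python. This is only a toy model, not the scheduler's actual code: the `Node` and `Pod` classes, the resource fields and the "most free resources wins" scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu: float      # free CPU cores (illustrative unit)
    free_mem_gib: float  # free memory in GiB


@dataclass
class Pod:
    name: str
    cpu: float
    mem_gib: float


def fits(pod: Pod, node: Node) -> bool:
    """Filtering step: a hard requirement, the node must have room for the pod."""
    return node.free_cpu >= pod.cpu and node.free_mem_gib >= pod.mem_gib


def score(pod: Pod, node: Node) -> float:
    """Scoring step: prefer the node with the most resources left over (toy rule)."""
    return (node.free_cpu - pod.cpu) + (node.free_mem_gib - pod.mem_gib)


def schedule(pod: Pod, nodes: list[Node]) -> Optional[str]:
    feasible = [n for n in nodes if fits(pod, n)]
    if not feasible:
        return None  # no feasible node: the pod would stay Pending
    return max(feasible, key=lambda n: score(pod, n)).name


nodes = [Node("node-a", 2, 4), Node("node-b", 8, 16), Node("node-c", 1, 64)]
print(schedule(Pod("web", 2, 2), nodes))    # node-b
print(schedule(Pod("huge", 32, 2), nodes))  # None
```

Here node-c is filtered out for the `web` pod because it lacks CPU, and node-b beats node-a on score; the `huge` pod fits nowhere, which corresponds to the Pending case.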

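II. Scheduling policies

2.1 Filtering (predicate) policies

2.2 Scoring (priority) policies

III. Node scheduling

The above is theory. In practice, scheduling can be influenced through configuration, for example deploying Redis onto memory-optimized machines, or deploying services with IP restrictions onto designated nodes. The following three mechanisms control which node a pod is scheduled to; they address the relationship between pods and nodes.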

3.1 nodeName

$ kubectl get node
NAME             STATUS   ROLES    AGE   VERSION
docker-desktop   Ready    master   11d   v1.19.3

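The output shows the node name. Setting nodeName pins the pod to that node directly, bypassing the scheduler; if the named node cannot be found, the pod stays in the Pending state.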

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-go-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: k8s-go-demo
  template:
    metadata:
      labels:
        app: k8s-go-demo
    spec:
      nodeName: docker-desktop
      containers:
      - name: k8s-go-demo
        image: pengbotao/k8s-go-demo
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 38001

3.2 nodeSelector

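nodeSelector is a little more flexible than nodeName: pods are matched to nodes by label, which effectively introduces the notion of groups. The nodes must be labeled first. Labels are applied as follows (the syntax is the same for pods and nodes):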

Examples:
  # Update pod 'foo' with the label 'unhealthy' and the value 'true'.
  kubectl label pods foo unhealthy=true

  # Update pod 'foo' with the label 'status' and the value 'unhealthy', overwriting any existing value.
  kubectl label --overwrite pods foo status=unhealthy

  # Update all pods in the namespace
  kubectl label pods --all status=unhealthy

  # Update a pod identified by the type and name in "pod.json"
  kubectl label -f pod.json status=unhealthy

  # Update pod 'foo' only if the resource is unchanged from version 1.
  kubectl label pods foo status=unhealthy --resource-version=1

  # Update pod 'foo' by removing a label named 'bar' if it exists.
  # Does not require the --overwrite flag.
  kubectl label pods foo bar-

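The pod can then state which nodes it wants to run on via pod.spec.nodeSelector. On my local cluster, for example, the manifest below is reported as not matching:

Warning FailedScheduling 9s default-scheduler 0/1 nodes are available: 1 node(s) didn't match node selector.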

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-go-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: k8s-go-demo
  template:
    metadata:
      labels:
        app: k8s-go-demo
    spec:
      nodeSelector:
        env: sandbox
      containers:
      - name: k8s-go-demo
        image: pengbotao/k8s-go-demo
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 38001
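The matching rule behind nodeSelector is simple enough to sketch: every key/value pair in the selector must appear among the node's labels. The helper name `matches_node_selector` and the sample label sets below are illustrative assumptions, not part of any Kubernetes API.

```python
def matches_node_selector(node_labels: dict[str, str], node_selector: dict[str, str]) -> bool:
    """True if the node carries every label listed in the selector."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


selector = {"env": "sandbox"}

# Labels before and after running `kubectl label node docker-desktop env=sandbox`.
unlabeled = {"kubernetes.io/os": "linux"}
labeled = {"kubernetes.io/os": "linux", "env": "sandbox"}

print(matches_node_selector(unlabeled, selector))  # False
print(matches_node_selector(labeled, selector))    # True
```

The first case corresponds to the FailedScheduling warning shown earlier; the second is the state in which the pods can be placed.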

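3.3 nodeAffinity

nodeSelector works like a whitelist, but it cannot express some combinations of conditions; nodeAffinity covers those cases. Its policies fall into hard policies (required), which must be satisfied or the pod stays Pending, and soft policies (preferred), which are satisfied when possible but not guaranteed.

Policy                                            Type        Description
requiredDuringSchedulingIgnoredDuringExecution    required    Must be satisfied, otherwise the pod stays Pending. IgnoredDuringExecution means that label changes on a node are ignored once the pod is running, so the pod keeps running.
preferredDuringSchedulingIgnoredDuringExecution   preferred   Preferred but not guaranteed.

Two further variants do not seem to be implemented yet, and I could not find them in the documentation. They differ from the ones above in that, when labels change and the condition is no longer satisfied, a matching node is selected again.

Example: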

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-go-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: k8s-go-demo
  template:
    metadata:
      labels:
        app: k8s-go-demo
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: env
                operator: In
                values: 
                - sandbox
                - test
      containers:
      - name: k8s-go-demo
        image: pengbotao/k8s-go-demo
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 38001

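This requires a node whose env label is sandbox or test. My local node has no such label by default, so the pods stay Pending; once the label is applied, they run successfully.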

$ kubectl label node docker-desktop env=sandbox

The available operators are In, NotIn, Exists, DoesNotExist, Gt and Lt. For example, to select nodes that are labeled env=sandbox or that have no env label at all:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: env
                operator: In
                values: 
                - sandbox
            - matchExpressions:
              - key: env
                operator: DoesNotExist

Only one of the terms under nodeSelectorTerms has to match, whereas every condition under a single matchExpressions must match. preferredDuringSchedulingIgnoredDuringExecution can be used like this:

    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1 # weight, valid range 1-100
            preference:
              matchExpressions:
              - key: env
                operator: In
                values: 
                - prod

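IV. Pod scheduling

Node affinity finds a node for a pod based on node labels. Pod affinity is finer-grained: it considers the affinity between a pod and other pods.

We can label Nodes ourselves, and the system also sets some labels by default:

Label                                       Example
failure-domain.beta.kubernetes.io/region    cn-hangzhou
failure-domain.beta.kubernetes.io/zone      cn-hangzhou-h
kubernetes.io/arch                          amd64
kubernetes.io/hostname                      cn-hangzhou.192.168.0.100
kubernetes.io/os                            linux
beta.kubernetes.io/instance-type            ecs.c6.4xlarge

Cloud vendors add labels of their own as well, for example:

Label                                            Example
alibabacloud.com/nodepool-id
topology.diskplugin.csi.alibabacloud.com/zone    cn-hangzhou-h
topology.kubernetes.io/region                    cn-hangzhou
topology.kubernetes.io/zone                      cn-hangzhou-h

"Running together" is defined relative to a label, for example all on Linux nodes or all in the same region; the label to use is given by topologyKey.

Usage is similar to node affinity. The label selector supports the operators In, NotIn, Exists and DoesNotExist.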

4.1 pod亲和性

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-go-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: k8s-go-demo
  template:
    metadata:
      labels:
        app: k8s-go-demo
    spec:
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 50
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchExpressions:
                - key: service
                  operator: In
                  values: 
                  - order
      containers:
      - name: k8s-go-demo
        image: pengbotao/k8s-go-demo
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 38001

requiredDuringSchedulingIgnoredDuringExecution can be used as well:

    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: service
                operator: In
                values: 
                - order
            topologyKey: kubernetes.io/hostname

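topologyKey defines how locations are partitioned. The examples above say that the pod would like to (preferred), or must (required), run together with pods labeled service=order, with locations told apart by the hostname label. Since hostnames are usually unique, this means running on the same node.

Pod affinity is a fairly specialized scenario; using a custom label as topologyKey is left to test when I actually run into it.

4.2 Pod anti-affinity

Defined in the field pod.spec.affinity.podAntiAffinity. It is used the same way as above, only the logic is inverted.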
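To make the role of topologyKey and the "inverted logic" more concrete, here is a rough Python sketch of the hard (required) case. It is a simplification under stated assumptions: the selector is a plain key/value map rather than matchExpressions, the node names and zone values are invented, and special cases of the real scheduler (such as a pod that matches its own affinity selector) are ignored.

```python
def domains_with_match(nodes: dict, pods: list, topology_key: str, selector: dict) -> set:
    """Topology values (domains) that already run a pod matching the selector."""
    domains = set()
    for pod in pods:
        node_labels = nodes[pod["node"]]
        pod_matches = all(pod["labels"].get(k) == v for k, v in selector.items())
        if pod_matches and topology_key in node_labels:
            domains.add(node_labels[topology_key])
    return domains


def candidate_nodes(nodes, pods, topology_key, selector, anti=False):
    """Affinity keeps nodes inside a matching domain; anti-affinity keeps the rest."""
    domains = domains_with_match(nodes, pods, topology_key, selector)
    result = []
    for name, labels in nodes.items():
        in_domain = labels.get(topology_key) in domains
        if in_domain != anti:
            result.append(name)
    return sorted(result)


HOST, ZONE = "kubernetes.io/hostname", "topology.kubernetes.io/zone"
nodes = {
    "node-1": {HOST: "node-1", ZONE: "zone-a"},
    "node-2": {HOST: "node-2", ZONE: "zone-a"},
    "node-3": {HOST: "node-3", ZONE: "zone-b"},
}
pods = [{"node": "node-1", "labels": {"service": "order"}}]
selector = {"service": "order"}

print(candidate_nodes(nodes, pods, HOST, selector))             # ['node-1']
print(candidate_nodes(nodes, pods, ZONE, selector))             # ['node-1', 'node-2']
print(candidate_nodes(nodes, pods, HOST, selector, anti=True))  # ['node-2', 'node-3']
print(candidate_nodes(nodes, pods, ZONE, selector, anti=True))  # ['node-3']
```

With the hostname key, "together" means the same node; with a zone key the same rule widens to every node in the zone, and anti-affinity correspondingly pushes the pod out of the whole zone.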

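V. Taints and tolerations

Taints and tolerations also define the scheduling relationship between Pods and Nodes: taints are placed on Nodes, tolerations are set on Pods. Read literally, a tainted Node has a defect, and a Pod can be scheduled onto that Node only if it tolerates the defect.

5.1 Defining taints

Add a taint
$ kubectl taint nodes [node] key=value:effect
Remove a taint; as with labels, append a minus sign
$ kubectl taint nodes [node] key=value:effect-
View taints
$ kubectl describe node docker-desktop | grep Taints

The value can be omitted. The possible values of effect are: [ NoSchedule | PreferNoSchedule | NoExecute ]

Example: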

$ kubectl taint nodes docker-desktop env=test:NoSchedule
node/docker-desktop tainted
$ kubectl describe node docker-desktop | grep Taints
Taints:             env=test:NoSchedule
$ kubectl taint nodes docker-desktop env-
node/docker-desktop untainted

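5.2 Defining tolerations

When a NoSchedule taint is set and the Pod has no matching toleration, the Pod stays Pending with the message:

default-scheduler 0/1 nodes are available: 1 node(s) had taint {env: test}, that the pod didn't tolerate.

Setting a toleration: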

$ kubectl describe node docker-desktop | grep Taints
Taints:             env=test:NoSchedule

$ kubectl apply -f k8s-go-demo.yaml
$ cat k8s-go-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-go-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: k8s-go-demo
  template:
    metadata:
      labels:
        app: k8s-go-demo
    spec:
      tolerations:
      - key: "env"
        value: "test"
        operator: "Equal"
        effect: "NoSchedule"
      containers:
      - name: k8s-go-demo
        image: pengbotao/k8s-go-demo
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 38001
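How a toleration is compared against a taint can be sketched as follows. This is a hand-written approximation of the matching rule as I understand it (effect, then key, then operator), not code taken from Kubernetes; the function names `tolerates` and `schedulable` are assumptions.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    # An empty effect in the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (used with operator Exists) matches every taint key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def schedulable(taints: list, tolerations: list) -> bool:
    """Every NoSchedule/NoExecute taint must be tolerated; PreferNoSchedule is only a preference."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


taints = [{"key": "env", "value": "test", "effect": "NoSchedule"}]
from_manifest = {"key": "env", "value": "test", "operator": "Equal", "effect": "NoSchedule"}

print(schedulable(taints, []))                                     # False
print(schedulable(taints, [from_manifest]))                        # True
print(schedulable(taints, [{"key": "env", "operator": "Exists"}])) # True
print(schedulable(taints, [{**from_manifest, "value": "prod"}]))   # False
```

The first call corresponds to the Pending case described above, the second to the manifest in this section; the Exists form tolerates the taint regardless of its value.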

-- EOF --
Last updated: 2021-09-15 08:27
Published: 2020-10-30 17:46
Tags: Kubernetes, containerization