I have a Kubernetes cluster with 3 nodes. When I deployed a StatefulSet to this cluster, I ran into a problem with the pod anti-affinity rules.
Here is the error message:
Warning FailedScheduling 32s (x8 over 6m44s) default-scheduler 0/3 nodes are available: 3 node(s) didn't match pod affinity/anti-affinity, 3 node(s) didn't satisfy existing pods anti-affinity rules.
Here is the nodeAffinity and podAntiAffinity configuration:
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: dolphin.region/name
            operator: In
            values:
            - common
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
            - redis-sentinel
        topologyKey: kubernetes.io/hostname
Node information:
[root@k8s-10 ~]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-10.67.42.10 Ready master 86d v1.13.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dolphin.region/name=common,k8s_ceph=ceph,kubernetes.io/hostname-master=kube-master,kubernetes.io/hostname=k8s-10.67.42.10,node-role.kubernetes.io/master=
k8s-10.67.42.11 Ready master 86d v1.13.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dolphin.region/name=common,k8s_ceph=ceph,kubernetes.io/hostname-master=kube-master,kubernetes.io/hostname=k8s-10.67.42.11,node-role.kubernetes.io/master=
k8s-10.67.42.12 Ready master 86d v1.13.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dolphin.region/name=common,k8s_ceph=ceph,kubernetes.io/hostname-master=kube-master,kubernetes.io/hostname=k8s-10.67.42.12,node-role.kubernetes.io/master=
Only 3 pods have the label app.kubernetes.io/name=redis-sentinel:
[root@k8s-10 ~]# kubectl get pod -n project -l app.kubernetes.io/name=redis-sentinel -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dol-redis-sentinel-0 1/1 Running 0 18h 10.244.243.2 k8s-10.67.42.12 <none> <none>
dol-redis-sentinel-1 1/1 Running 0 16h 10.244.35.170 k8s-10.67.42.10 <none> <none>
dol-redis-sentinel-2 0/1 Pending 0 29m <none> <none> <none> <none>
Node k8s-10.67.42.11 has no taints either:
[root@k8s-10 ~]# kubectl describe node k8s-10.67.42.11 | grep Taint
Taints: <none>
[root@k8s-10 ~]# kubectl get pods --show-labels -n project |grep sentinel
dol-redis-sentinel-0 1/1 Running 0 18h app.kubernetes.io/instance=dol-redis-sentinel,app.kubernetes.io/name=redis-sentinel,app=dol-redis-sentinel,controller-revision-hash=dol-redis-sentinel-b4975b6bd,statefulset.kubernetes.io/pod-name=dol-redis-sentinel-0
dol-redis-sentinel-1 1/1 Running 0 16h app.kubernetes.io/instance=dol-redis-sentinel,app.kubernetes.io/name=redis-sentinel,app=dol-redis-sentinel,controller-revision-hash=dol-redis-sentinel-b4975b6bd,statefulset.kubernetes.io/pod-name=dol-redis-sentinel-1
dol-redis-sentinel-2 0/1 Pending 0 38m app.kubernetes.io/instance=dol-redis-sentinel,app.kubernetes.io/name=redis-sentinel,app=dol-redis-sentinel,controller-revision-hash=dol-redis-sentinel-b4975b6bd,statefulset.kubernetes.io/pod-name=dol-redis-sentinel-2
dol-redis-sentinel-exporter-75c8cbdc97-wztpk 1/1 Running 0 7d app.kubernetes.io/part-of=redis-sentinel,app=dol-redis-sentinel-exporter,dolphin/part-of-instance=dol-redis-sentinel,pod-template-hash=75c8cbdc97
The Kubernetes server version is v1.13.3:
[root@k8s-10 ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:08:12Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:00:57Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
How can I get the pod scheduled onto node k8s-10.67.42.11?
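I think this is because of the anti-affinity rule on the label "app.kubernetes.io/name". Note that its labelSelector matches pods carrying that label, not nodes: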
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: app.kubernetes.io/name
        operator: In
        values:
        - redis-sentinel
[root@k8s-10 ~]# kubectl get pod -n project -l app.kubernetes.io/name=redis-sentinel -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dol-redis-sentinel-0 1/1 Running 0 18h 10.244.243.2 k8s-10.67.42.12 <none> <none>
dol-redis-sentinel-1 1/1 Running 0 16h 10.244.35.170 k8s-10.67.42.10 <none> <none>
dol-redis-sentinel-2 0/1 Pending 0 29m <none> <none> <none> <none>
This can also happen if other pods have anti-affinity rules that prevent a new pod with a matching label from being scheduled on the same node.
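To make the two directions of the check concrete, here is a small Python sketch of how required anti-affinity can rule out a node. It is a simplified model, not the scheduler's actual code: it assumes every term uses topologyKey kubernetes.io/hostname (so "same topology domain" means "same node"), it only compares pods in the same namespace, and the pod named hypothetical-blocker is invented purely for illustration. Nothing in the question shows such a pod on k8s-10.67.42.11.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# One required anti-affinity term: (label key, accepted values).
# Simplification: topologyKey is always kubernetes.io/hostname.
Term = Tuple[str, Tuple[str, ...]]


@dataclass
class Pod:
    name: str
    namespace: str
    labels: Dict[str, str]
    node: Optional[str] = None  # None while the pod is Pending
    anti_affinity: List[Term] = field(default_factory=list)


def selects(term: Term, pod: Pod) -> bool:
    """True if the term's label selector matches the pod's labels."""
    key, values = term
    return pod.labels.get(key) in values


def blocked_reasons(new_pod: Pod, node: str, existing: List[Pod]) -> List[str]:
    """Why `node` is infeasible for `new_pod`; an empty list means feasible."""
    reasons = []
    for other in existing:
        if other.node != node or other.namespace != new_pod.namespace:
            continue
        # Direction 1: the new pod's own rule selects a pod already on the node.
        if any(selects(t, other) for t in new_pod.anti_affinity):
            reasons.append(f"new pod's own rule selects {other.name}")
        # Direction 2: a pod already on the node has a rule selecting the new pod.
        if any(selects(t, new_pod) for t in other.anti_affinity):
            reasons.append(f"{other.name}'s rule selects the new pod")
    return reasons


SENTINEL_LABELS = {"app.kubernetes.io/name": "redis-sentinel"}
SENTINEL_RULE = [("app.kubernetes.io/name", ("redis-sentinel",))]

existing = [
    Pod("dol-redis-sentinel-0", "project", SENTINEL_LABELS, "k8s-10.67.42.12", SENTINEL_RULE),
    Pod("dol-redis-sentinel-1", "project", SENTINEL_LABELS, "k8s-10.67.42.10", SENTINEL_RULE),
    # Hypothetical: differently labeled pod whose own rule repels redis-sentinel pods.
    Pod("hypothetical-blocker", "project", {"app": "something-else"}, "k8s-10.67.42.11", SENTINEL_RULE),
]
new_pod = Pod("dol-redis-sentinel-2", "project", SENTINEL_LABELS, None, SENTINEL_RULE)

for node in ("k8s-10.67.42.10", "k8s-10.67.42.11", "k8s-10.67.42.12"):
    print(node, blocked_reasons(new_pod, node, existing))
```

In this model the third node is rejected only because of the second direction, which a label query for app.kubernetes.io/name=redis-sentinel would not reveal. To look for such a pod on a real cluster, listing everything on the node with something like kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=k8s-10.67.42.11 and then inspecting each pod's spec.affinity may help.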