Cluster-autoscaler does not trigger a scale-up for a Daemonset deployment

Question (votes: 1, answers: 1)

I deployed the Datadog agent with the Datadog Helm chart, which deploys a Daemonset in Kubernetes. However, when I checked the state of the Daemonset, I saw that it had not created all of its pods:

    NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    datadog-agent-datadog   5         2         2       2            2           <none>          1h

When I described the Daemonset to figure out what was going wrong, I found that the nodes did not have enough resources:

    Events:
      Type     Reason            Age                From                  Message
      ----     ------            ----               ----                  -------
      Warning  FailedPlacement   42s (x6 over 42s)  daemonset-controller  failed to place pod on "ip-10-0-1-124.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Warning  FailedPlacement   42s (x6 over 42s)  daemonset-controller  failed to place pod on "<ip>": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Warning  FailedPlacement   42s (x5 over 42s)  daemonset-controller  failed to place pod on "<ip>": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
      Warning  FailedPlacement   42s (x7 over 42s)  daemonset-controller  failed to place pod on "<ip>": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
      Normal   SuccessfulCreate  42s                daemonset-controller  Created pod: datadog-agent-7b2kp

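However, I have Cluster-autoscaler installed in the cluster and configured properly (it does trigger for regular Pod deployments that lack the resources to be scheduled), but it does not seem to trigger for the Daemonset:

    I0424 14:14:48.545689       1 static_autoscaler.go:273] No schedulable pods
    I0424 14:14:48.545700       1 static_autoscaler.go:280] No unschedulable pods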

The AutoScalingGroup still has enough nodes left (screenshot not reproduced here).

Did I miss something in the configuration of the Cluster-autoscaler? What can I do to make sure it also triggers for Daemonset resources?

Edit: output of describing the Daemonset

    Name:           datadog-agent
    Selector:       app=datadog-agent
    Node-Selector:  <none>
    Labels:         app=datadog-agent
                    chart=datadog-1.27.2
                    heritage=Tiller
                    release=datadog-agent
    Annotations:    deprecated.daemonset.template.generation: 1
    Desired Number of Nodes Scheduled: 5
    Current Number of Nodes Scheduled: 2
    Number of Nodes Scheduled with Up-to-date Pods: 2
    Number of Nodes Scheduled with Available Pods: 2
    Number of Nodes Misscheduled: 0
    Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:           app=datadog-agent
      Annotations:      checksum/autoconf-config: 38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
                        checksum/checksd-config: 38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
                        checksum/confd-config: 38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
      Service Account:  datadog-agent
      Containers:
       datadog:
        Image:      datadog/agent:6.10.1
        Port:       8125/UDP
        Host Port:  0/UDP
        Limits:
          cpu:     200m
          memory:  256Mi
        Requests:
          cpu:     200m
          memory:  256Mi
        Liveness:  http-get http://:5555/health delay=15s timeout=5s period=15s #success=1 #failure=6
        Environment:
          DD_API_KEY:                  <set to the key 'api-key' in secret 'datadog-secret'>  Optional: false
          DD_LOG_LEVEL:                INFO
          KUBERNETES:                  yes
          DD_KUBERNETES_KUBELET_HOST:  (v1:status.hostIP)
          DD_HEALTH_PORT:              5555
        Mounts:
          /host/proc from procdir (ro)
          /host/sys/fs/cgroup from cgroups (ro)
          /var/run/docker.sock from runtimesocket (ro)
          /var/run/s6 from s6-run (rw)
      Volumes:
       runtimesocket:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:
       procdir:
        Type:          HostPath (bare host directory volume)
        Path:          /proc
        HostPathType:
       cgroups:
        Type:          HostPath (bare host directory volume)
        Path:          /sys/fs/cgroup
        HostPathType:
       s6-run:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
        SizeLimit:  <unset>
    Events:
      Type     Reason            Age                 From                  Message
      ----     ------            ----                ----                  -------
      Warning  FailedPlacement   33m (x6 over 33m)   daemonset-controller  failed to place pod on "ip-10-0-2-144.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Normal   SuccessfulCreate  33m                 daemonset-controller  Created pod: datadog-agent-7b2kp
      Warning  FailedPlacement   16m (x25 over 33m)  daemonset-controller  failed to place pod on "ip-10-0-1-124.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Warning  FailedPlacement   16m (x25 over 33m)  daemonset-controller  failed to place pod on "ip-10-0-2-174.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
      Warning  FailedPlacement   16m (x25 over 33m)  daemonset-controller  failed to place pod on "ip-10-0-3-250.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
kubernetes
1 answer
1 vote

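You should understand how the cluster autoscaler works. It is only responsible for adding or removing nodes; it does not create or destroy pods. In your case the cluster autoscaler does nothing because adding a node would not help: even with one more node, the DaemonSet pods would still have to run on the existing nodes that are short on CPU. That is why it does not add nodes.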
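To make that reasoning concrete, here is a minimal sketch in Python. It is a toy model rather than anything the autoscaler or the DaemonSet controller actually runs: the `used` and `capacity` figures are copied from the FailedPlacement events in the question, while the node called `new-node` and the helper `unplaced` are made up for illustration.

```python
# Toy model of the situation in the question (not real Kubernetes code).
# Figures are millicores, taken from the FailedPlacement events.
CAPACITY_M = 2000   # reported node capacity
REQUEST_M = 200     # CPU request of the Datadog DaemonSet pod

used_m = {
    "ip-10-0-1-124": 1810,
    "ip-10-0-2-174": 1860,
    "ip-10-0-3-250": 1860,
}


def unplaced(nodes):
    """Return the nodes whose free CPU is too small for the DaemonSet pod."""
    return sorted(
        name for name, used in nodes.items()
        if used + REQUEST_M > CAPACITY_M
    )


before = unplaced(used_m)

# Simulate a scale-up by adding one empty node (hypothetical name).
after = unplaced({**used_m, "new-node": 0})

print(before)
print(after)
```

Both lists contain the same three nodes. The extra node could host its own DaemonSet pod, but that does nothing for the nodes that were already full.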

What you should do is manually remove some pods from the nodes that are full. The DaemonSet pods can then be scheduled there.
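As a rough guide to how much has to be freed, the shortfall per node can be worked out from the numbers in the events. The sketch below assumes that `used` is the sum of CPU requests already on the node, in millicores, as the event message suggests; the function name `cpu_to_free_m` is hypothetical.

```python
def cpu_to_free_m(used_m, request_m=200, capacity_m=2000):
    """Millicores of existing requests that must go before the pod fits."""
    return max(0, used_m + request_m - capacity_m)


# The two "used" values that appear in the FailedPlacement events.
shortfall = {used: cpu_to_free_m(used) for used in (1810, 1860)}

for used, missing in shortfall.items():
    print(f"used={used}m -> free at least {missing}m")
```

Under these assumptions the nodes at 1810m are short by 10m and the nodes at 1860m by 60m, so evicting even one small pod per node may be enough.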

Alternatively, you can reduce Datadog's CPU request to, for example, 100m or 50m. That should be enough for those pods to start.
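The arithmetic behind the suggested values can be checked with a short sketch. It uses only the `used` and `capacity` figures from the events in the question and a made-up helper `fits`; it is not how Kubernetes itself evaluates placement.

```python
def fits(used_m, request_m, capacity_m=2000):
    """True if a pod requesting request_m millicores fits on the node."""
    return used_m + request_m <= capacity_m


# Check the current request (200m) and the two suggested values
# against both "used" figures reported in the events.
results = {
    (used, request): fits(used, request)
    for used in (1810, 1860)
    for request in (200, 100, 50)
}

for (used, request), ok in results.items():
    print(f"used={used}m request={request}m fits={ok}")
```

With a 200m request neither node has room (2010m and 2060m against a capacity of 2000m), while 100m and 50m both fit on every node listed in the events.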
