Cluster-autoscaler not triggering scale-up on Daemonset deployment

Question

I deployed the Datadog agent using the Datadog Helm chart, which deploys a Daemonset in Kubernetes. However, when checking the state of the Daemonset, I saw that it was not creating all pods:

    NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    datadog-agent-datadog   5         2         2       2            2           <none>          1h

When describing the Daemonset to figure out what was going wrong, I saw that the nodes did not have enough resources:

    Events:
      Type     Reason            Age                From                  Message
      ----     ------            ----               ----                  -------
      Warning  FailedPlacement   42s (x6 over 42s)  daemonset-controller  failed to place pod on "ip-10-0-1-124.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Warning  FailedPlacement   42s (x6 over 42s)  daemonset-controller  failed to place pod on "<ip>": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Warning  FailedPlacement   42s (x5 over 42s)  daemonset-controller  failed to place pod on "<ip>": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
      Warning  FailedPlacement   42s (x7 over 42s)  daemonset-controller  failed to place pod on "<ip>": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
      Normal   SuccessfulCreate  42s                daemonset-controller  Created pod: datadog-agent-7b2kp

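However, I have the Cluster-autoscaler installed in the cluster and configured properly (it does trigger on regular Pod deployments that do not have enough resources to schedule), but it does not seem to trigger on the Daemonset:

    I0424 14:14:48.545689       1 static_autoscaler.go:273] No schedulable pods
    I0424 14:14:48.545700       1 static_autoscaler.go:280] No unschedulable pods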

The AutoScalingGroup has enough nodes left.

Did I miss something in the configuration of the Cluster-autoscaler? What can I do to make sure it triggers on Daemonset resources as well?

Edit: output of describing the Daemonset

    Name:           datadog-agent
    Selector:       app=datadog-agent
    Node-Selector:  <none>
    Labels:         app=datadog-agent
                    chart=datadog-1.27.2
                    heritage=Tiller
                    release=datadog-agent
    Annotations:    deprecated.daemonset.template.generation: 1
    Desired Number of Nodes Scheduled: 5
    Current Number of Nodes Scheduled: 2
    Number of Nodes Scheduled with Up-to-date Pods: 2
    Number of Nodes Scheduled with Available Pods: 2
    Number of Nodes Misscheduled: 0
    Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:           app=datadog-agent
      Annotations:      checksum/autoconf-config: 38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
                        checksum/checksd-config: 38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
                        checksum/confd-config: 38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
      Service Account:  datadog-agent
      Containers:
       datadog:
        Image:      datadog/agent:6.10.1
        Port:       8125/UDP
        Host Port:  0/UDP
        Limits:
          cpu:     200m
          memory:  256Mi
        Requests:
          cpu:     200m
          memory:  256Mi
        Liveness:  http-get http://:5555/health delay=15s timeout=5s period=15s #success=1 #failure=6
        Environment:
          DD_API_KEY:                  <set to the key 'api-key' in secret 'datadog-secret'>  Optional: false
          DD_LOG_LEVEL:                INFO
          KUBERNETES:                  yes
          DD_KUBERNETES_KUBELET_HOST:  (v1:status.hostIP)
          DD_HEALTH_PORT:              5555
        Mounts:
          /host/proc from procdir (ro)
          /host/sys/fs/cgroup from cgroups (ro)
          /var/run/docker.sock from runtimesocket (ro)
          /var/run/s6 from s6-run (rw)
      Volumes:
       runtimesocket:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:
       procdir:
        Type:          HostPath (bare host directory volume)
        Path:          /proc
        HostPathType:
       cgroups:
        Type:          HostPath (bare host directory volume)
        Path:          /sys/fs/cgroup
        HostPathType:
       s6-run:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
        SizeLimit:  <unset>
    Events:
      Type     Reason            Age                 From                  Message
      ----     ------            ----                ----                  -------
      Warning  FailedPlacement   33m (x6 over 33m)   daemonset-controller  failed to place pod on "ip-10-0-2-144.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Normal   SuccessfulCreate  33m                 daemonset-controller  Created pod: datadog-agent-7b2kp
      Warning  FailedPlacement   16m (x25 over 33m)  daemonset-controller  failed to place pod on "ip-10-0-1-124.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1810, capacity: 2000
      Warning  FailedPlacement   16m (x25 over 33m)  daemonset-controller  failed to place pod on "ip-10-0-2-174.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000
      Warning  FailedPlacement   16m (x25 over 33m)  daemonset-controller  failed to place pod on "ip-10-0-3-250.eu-west-1.compute.internal": Node didn't have enough resource: cpu, requested: 200, used: 1860, capacity: 2000

Answer 1

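You should understand how the cluster autoscaler works. It is only responsible for adding or removing nodes. It is not responsible for creating or destroying pods. So in your case the cluster autoscaler does nothing, because adding a node would not help. Even if it added one more node, the DaemonSet pods would still have to run on the existing nodes that do not have enough CPU. That is why it is not adding nodes.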
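The FailedPlacement events in the question make this concrete. Below is a minimal sketch of the fit check, assuming the numbers in the events are CPU millicores and that a pod fits only when `used + requested <= capacity`. The function name `fits` is hypothetical and is not part of Kubernetes or the cluster autoscaler.

```python
def fits(requested_m: int, used_m: int, capacity_m: int) -> bool:
    """Return True if a pod requesting `requested_m` millicores fits on a node."""
    return used_m + requested_m <= capacity_m


# Values copied from the FailedPlacement events in the question.
REQUESTED_M = 200
CAPACITY_M = 2000
used_on_full_nodes = [1810, 1860]

# The DaemonSet pod does not fit on either kind of full node.
results = [fits(REQUESTED_M, used, CAPACITY_M) for used in used_on_full_nodes]
print(results)  # [False, False]

# A freshly added, empty node (assumed to have the same capacity) would accept
# its own DaemonSet pod, but that does nothing for the pods that must run on
# the full nodes.
print(fits(REQUESTED_M, 0, CAPACITY_M))  # True
```

The second check shows why a scale-up would not resolve the events above: the new node has room, but the existing nodes remain just as full.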

What you should do is manually remove some pods from the occupied nodes. The DaemonSet pods can then be scheduled there.

Alternatively, you can reduce the CPU request of Datadog to, for example, 100m or 50m. This should be enough to start those pods.
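As a rough check of those numbers, the sketch below computes the CPU headroom on the full nodes from the values in the question's events. It assumes the event values are millicores, looks only at CPU (not memory), and covers only the two usage levels that appear in the events. The names `headroom` and `max_request_m` are hypothetical.

```python
CAPACITY_M = 2000           # node CPU capacity from the events
USED_M = [1810, 1860]       # CPU already requested on the full nodes


def headroom(used_m: int, capacity_m: int = CAPACITY_M) -> int:
    """Millicores still available for new pod requests on a node."""
    return capacity_m - used_m


# The largest request that fits on every full node is the smallest headroom.
max_request_m = min(headroom(used) for used in USED_M)
print(max_request_m)  # 140

# Compare the current request with the two suggested values.
for candidate_m in (200, 100, 50):
    print(candidate_m, candidate_m <= max_request_m)
```

Under these assumptions, any request up to 140m would fit, so both 100m and 50m leave room while the current 200m does not. Equivalently, freeing 60m on the busiest nodes (10m on the others) would let the 200m request through.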
