How to resolve 'container runtime is down, PLEG is not healthy'


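I have a Kubernetes cluster with 2 nodes. Each node runs about 6-7 pods, and each pod has 2 containers: one is my own Docker image, and the other is the sidecar that Istio injects for its service mesh. After about 10 hours, however, the nodes become "NotReady", and the node description shows 2 errors: 1. container runtime is down, PLEG is not healthy: pleg was last seen active 1h32m35.942907195s ago; threshold is 3m0s. 2. rpc error: code = DeadlineExceeded desc = context deadline exceeded, Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?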

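When I restart the node it works fine, but after a while it goes back to "NotReady". I started seeing this problem after adding Istio, but I could not find any documentation that relates the two. My next step is to try upgrading Kubernetes.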

Node description log:

Name:               aks-agentpool-22124581-0
Roles:              agent
Labels:             agentpool=agentpool
                    beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_B2s
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eastus
                    failure-domain.beta.kubernetes.io/zone=1
                    kubernetes.azure.com/cluster=MC_XXXXXXXXX
                    kubernetes.io/hostname=aks-XXXXXXXXX
                    kubernetes.io/role=agent
                    node-role.kubernetes.io/agent=
                    storageprofile=managed
                    storagetier=Premium_LRS
Annotations:        aks.microsoft.com/remediated=3
                    node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp:  Thu, 25 Oct 2018 14:46:53 +0000
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 25 Oct 2018 14:49:06 +0000   Thu, 25 Oct 2018 14:49:06 +0000   RouteCreated                 RouteController created a route
  OutOfDisk            False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure       False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 19 Dec 2018 19:28:55 +0000   Thu, 25 Oct 2018 14:46:53 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletNotReady              container runtime is down,PLEG is not healthy: pleg was lastseen active 1h32m35.942907195s ago; threshold is 3m0s
Addresses:
  Hostname:  aks-XXXXXXXXX
Capacity:
 cpu:                2
 ephemeral-storage:  30428648Ki
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             4040536Ki
 pods:               110
Allocatable:
 cpu:                1940m
 ephemeral-storage:  28043041951
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             3099480Ki
 pods:               110
System Info:
 Machine ID:                 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 System UUID:                XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 Boot ID:                    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 Kernel Version:             4.15.0-1035-azure
 OS Image:                   Ubuntu 16.04.5 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://Unknown
 Kubelet Version:            v1.11.3
 Kube-Proxy Version:         v1.11.3
PodCIDR:                     10.244.0.0/24
ProviderID:                  azure:///subscriptions/9XXXXXXXXXXX/resourceGroups/MC_XXXXXXXXXXXXXXXXXXXXXXXXXXXX/providers/Microsoft.Compute/virtualMachines/aks-XXXXXXXXXXXX
Non-terminated Pods:         (42 in total)
  Namespace                  Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                                                               ------------  ----------  ---------------  -------------
  default                    emailgistics-graph-monitor-6477568564-q98p2                        10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-message-handler-7df4566b6f-mh255                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-reports-aggregator-5fd96b94cb-b5vbn                   10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-rules-844b77f46-5lrkw                                 10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-scheduler-754884b566-mwgvp                            10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-subscription-token-manager-7974558985-f2t49           10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    mollified-kiwi-cert-manager-665c5d9c8c-2ld59                       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  istio-system               grafana-59b787b9b-dzdtc                                            10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-citadel-5d8956cc6-x55vk                                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-egressgateway-f48fc7fbb-szpwp                                10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-galley-6975b6bd45-g7lsc                                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-ingressgateway-c6c4bcdbf-bbgcw                               10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-pilot-d9b5b9b7c-ln75n                                        510m (26%)    0 (0%)      2Gi (67%)        0 (0%)
  istio-system               istio-policy-6b465cd4bf-92l57                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-policy-6b465cd4bf-b2z85                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-policy-6b465cd4bf-j59r4                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-policy-6b465cd4bf-s9pdm                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-sidecar-injector-575597f5cf-npkcz                            10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-9794j                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-g7gh5                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-gd88n                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-px8qb                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-xzslh                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-tracing-7596597bd7-hjtq2                                     10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               prometheus-76db5fddd5-d6dxs                                        10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               servicegraph-758f96bf5b-c9sqk                                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  kube-system                addon-http-application-routing-default-http-backend-5ccb95zgfm8    10m (0%)      10m (0%)    20Mi (0%)        20Mi (0%)
  kube-system                addon-http-application-routing-external-dns-59d8698886-h8xds       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                addon-http-application-routing-nginx-ingress-controller-ff49qc7    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                heapster-5d6f9b846c-m4kfp                                          130m (6%)     130m (6%)   230Mi (7%)       230Mi (7%)
  kube-system                kube-dns-v20-7c7d7d4c66-qqkfm                                      120m (6%)     0 (0%)      140Mi (4%)       220Mi (7%)
  kube-system                kube-dns-v20-7c7d7d4c66-wrxjm                                      120m (6%)     0 (0%)      140Mi (4%)       220Mi (7%)
  kube-system                kube-proxy-2tb68                                                   100m (5%)     0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-svc-redirect-d6gqm                                            10m (0%)      0 (0%)      34Mi (1%)        0 (0%)
  kube-system                kubernetes-dashboard-68f468887f-l9x46                              100m (5%)     100m (5%)   50Mi (1%)        300Mi (9%)
  kube-system                metrics-server-5cbc77f79f-x55cs                                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                omsagent-mhrqm                                                     50m (2%)      150m (7%)   150Mi (4%)       300Mi (9%)
  kube-system                omsagent-rs-d688cdf68-pjpmj                                        50m (2%)      150m (7%)   100Mi (3%)       500Mi (16%)
  kube-system                tiller-deploy-7f4974b9c8-flkjm                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                tunnelfront-7f766dd857-kgqps                                       10m (0%)      0 (0%)      64Mi (2%)        0 (0%)
  kube-systems-dev           nginx-ingress-dev-controller-7f78f6c8f9-csct4                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-systems-dev           nginx-ingress-dev-default-backend-95fbc75b7-lq9tw                  0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests      Limits
  --------  --------      ------
  cpu       1540m (79%)   540m (27%)
  memory    2976Mi (98%)  1790Mi (59%)
Events:
  Type     Reason             Age                 From                               Message
  ----     ------             ----                ----                               -------
  Warning  ContainerGCFailed  48m (x43 over 19h)  kubelet, aks-agentpool-22124581-0  rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  ImageGCFailed      29m (x57 over 18h)  kubelet, aks-agentpool-22124581-0  failed to get image stats: rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
  Warning  ContainerGCFailed  2m (x237 over 18h)  kubelet, aks-agentpool-22124581-0  rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Typical deployment file:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: emailgistics-pod
spec:
  minReadySeconds: 10
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      annotations:
        sidecar.istio.io/status: '{"version":"ebf16d3ea0236e4b5cb4d3fc0f01da62e2e6265d005e58f8f6bd43a4fb672fdd","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs"],"imagePullSecrets":null}'
      creationTimestamp: null
      labels:
        app: emailgistics-pod
    spec:
      containers:
      - image: xxxxxxxxxxxxxxxxxxxxx/emailgistics_pod:xxxxxx
        imagePullPolicy: Always
        name: emailgistics-pod
        ports:
        - containerPort: 80
        resources: {}
      - args:
        - proxy
        - sidecar
        - --configPath
        - /etc/istio/proxy
        - --binaryPath
        - /usr/local/bin/envoy
        - --serviceCluster
        - emailgistics-pod
        - --drainDuration
        - 45s
        - --parentShutdownDuration
        - 1m0s
        - --discoveryAddress
        - istio-pilot.istio-system:15005
        - --discoveryRefreshDelay
        - 1s
        - --zipkinAddress
        - zipkin.istio-system:9411
        - --connectTimeout
        - 10s
        - --proxyAdminPort
        - "15000"
        - --controlPlaneAuthPolicy
        - MUTUAL_TLS
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: INSTANCE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: ISTIO_META_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ISTIO_META_INTERCEPTION_MODE
          value: REDIRECT
        - name: ISTIO_METAJSON_LABELS
          value: |
            {"app":"emailgistics-pod"}
        image: docker.io/istio/proxyv2:1.0.4
        imagePullPolicy: IfNotPresent
        name: istio-proxy
        ports:
        - containerPort: 15090
          name: http-envoy-prom
          protocol: TCP
        resources:
          requests:
            cpu: 10m
        securityContext:
          readOnlyRootFilesystem: true
          runAsUser: 1337
        volumeMounts:
        - mountPath: /etc/istio/proxy
          name: istio-envoy
        - mountPath: /etc/certs/
          name: istio-certs
          readOnly: true
      imagePullSecrets:
      - name: ga.secretname
      initContainers:
      - args:
        - -p
        - "15001"
        - -u
        - "1337"
        - -m
        - REDIRECT
        - -i
        - '*'
        - -x
        - ""
        - -b
        - "80"
        - -d
        - ""
        image: docker.io/istio/proxy_init:1.0.4
        imagePullPolicy: IfNotPresent
        name: istio-init
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
          privileged: true
      volumes:
      - emptyDir:
          medium: Memory
        name: istio-envoy
      - name: istio-certs
        secret:
          optional: true
          secretName: istio.default
status: {}
---
docker kubernetes istio azure-kubernetes azure-aks
1 Answer

This is currently a known bug, and no real fix has been released yet that restores normal node behavior. See the following issues:

https://github.com/kubernetes/kubernetes/issues/45419

https://github.com/kubernetes/kubernetes/issues/61117

https://github.com/Azure/AKS/issues/102

Hopefully we will have a solution soon.
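
Until a fix lands, it may help to catch the condition early instead of discovering it hours later. The following is only a minimal sketch, assuming you already have the text of the node's Ready condition message (for example, copied from `kubectl describe node`). The helper names `parse_go_duration` and `pleg_status` are hypothetical and are not part of any Kubernetes client library; the message layout is taken from the node description in the question.

```python
import re
from datetime import timedelta

# Seconds per unit, for the units that appear in Go-style duration strings.
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}
# Multi-letter units are listed before "m" and "s" so they match first.
_UNITS = r"(?:ns|us|µs|ms|h|m|s)"
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(" + _UNITS[3:-1] + ")")
_DURATION = r"((?:\d+(?:\.\d+)?" + _UNITS + r")+)"

# Matches both "last seen" and the "lastseen" spelling shown in the question.
_PLEG = re.compile(
    r"pleg was last\s*seen active " + _DURATION + r" ago; threshold is " + _DURATION
)


def parse_go_duration(text):
    """Convert a duration such as '1h32m35.9s' or '3m0s' to a timedelta."""
    pos = 0
    total = 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            raise ValueError("unexpected characters in duration: %r" % text)
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError("not a valid duration: %r" % text)
    return timedelta(seconds=total)


def pleg_status(message):
    """Return (inactive, threshold, exceeded), or None if no PLEG text is found."""
    match = _PLEG.search(message)
    if match is None:
        return None
    inactive = parse_go_duration(match.group(1))
    threshold = parse_go_duration(match.group(2))
    return inactive, threshold, inactive > threshold


# Ready condition message copied from the node description in the question.
message = (
    "container runtime is down,PLEG is not healthy: pleg was lastseen active "
    "1h32m35.942907195s ago; threshold is 3m0s"
)
status = pleg_status(message)
if status is not None:
    inactive, threshold, exceeded = status
    print("inactive:", inactive, "threshold:", threshold, "exceeded:", exceeded)
```

A check like this could run on a schedule against each node's Ready message so that a stalled PLEG is flagged soon after the threshold is crossed. It only reports the symptom and does not address the underlying bug tracked in the issues above.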
