我正在尝试使用kubeadm安装Kubernetes后创建一个水平pod自动缩放。
主要症状是kubectl get hpa
将TARGETS
列中的CPU指标返回为“undefined”:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
fibonacci Deployment/fibonacci <unknown> / 50% 1 3 1 1h
在进一步调查中,似乎hpa
正在尝试从Heapster接收CPU指标 - 但在我的配置中,cpu指标由cAdvisor提供。
我根据kubectl describe hpa fibonacci
的输出做出这个假设:
Name: fibonacci
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Sun, 14 May 2017 18:08:53 +0000
Reference: Deployment/fibonacci
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 50%
Min replicas: 1
Max replicas: 3
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 3s 148 horizontal-pod-autoscaler Warning FailedGetResourceMetric unable to get metrics for resource cpu: no metrics returned from heapster
1h 3s 148 horizontal-pod-autoscaler Warning FailedComputeMetricsReplicas failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from heapster
为什么hpa
试图从heapster而不是cAdvisor接收这个指标?
我怎样才能解决这个问题?
请在下面找到我的部署,以及/var/log/container/kube-controller-manager.log
的内容以及kubectl get pods --namespace=kube-system
和kubectl describe pods
的输出
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: fibonacci
labels:
app: fibonacci
spec:
template:
metadata:
labels:
app: fibonacci
spec:
containers:
- name: fibonacci
image: oghma/fibonacci
ports:
- containerPort: 8088
resources:
requests:
memory: "64Mi"
cpu: "75m"
limits:
memory: "128Mi"
cpu: "100m"
---
kind: Service
apiVersion: v1
metadata:
name: fibonacci
spec:
selector:
app: fibonacci
ports:
- protocol: TCP
port: 8088
targetPort: 8088
externalIPs:
- 192.168.66.103
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: fibonacci
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: fibonacci
minReplicas: 1
maxReplicas: 3
targetCPUUtilizationPercentage: 50
$ kubectl describe pods
Name: fibonacci-1503002127-3k755
Namespace: default
Node: kubernetesnode1/192.168.66.101
Start Time: Sun, 14 May 2017 17:47:08 +0000
Labels: app=fibonacci
pod-template-hash=1503002127
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"fibonacci-1503002127","uid":"59ea64bb-38cd-11e7-b345-fa163edb1ca...
Status: Running
IP: 192.168.202.1
Controllers: ReplicaSet/fibonacci-1503002127
Containers:
fibonacci:
Container ID: docker://315375c6a978fd689f4ba61919c15f15035deb9139982844cefcd46092fbec14
Image: oghma/fibonacci
Image ID: docker://sha256:26f9b6b2c0073c766b472ec476fbcd2599969b6e5e7f564c3c0a03f8355ba9f6
Port: 8088/TCP
State: Running
Started: Sun, 14 May 2017 17:47:16 +0000
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 128Mi
Requests:
cpu: 75m
memory: 64Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-45kp8 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-45kp8:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-45kp8
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events: <none>
$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-k1g53 1/1 Running 0 2h
calico-node-6n4gp 2/2 Running 1 2h
calico-node-nhmz7 2/2 Running 0 2h
calico-policy-controller-1324707180-65m78 1/1 Running 0 2h
etcd-kubernetesmaster 1/1 Running 0 2h
heapster-1428305041-zjzd1 1/1 Running 0 1h
kube-apiserver-kubernetesmaster 1/1 Running 0 2h
kube-controller-manager-kubernetesmaster 1/1 Running 0 2h
kube-dns-3913472980-gbg5h 3/3 Running 0 2h
kube-proxy-1dt3c 1/1 Running 0 2h
kube-proxy-tfhr9 1/1 Running 0 2h
kube-scheduler-kubernetesmaster 1/1 Running 0 2h
monitoring-grafana-3975459543-9q189 1/1 Running 0 1h
monitoring-influxdb-3480804314-7bvr3 1/1 Running 0 1h
$ cat /var/log/container/kube-controller-manager.log
"log":"I0514 17:47:08.631314 1 event.go:217] Event(v1.ObjectReference{Kind:\"Deployment\", Namespace:\"default\", Name:\"fibonacci\", UID:\"59e980d9-38cd-11e7-b345-fa163edb1ca6\", APIVersion:\"extensions\", ResourceVersion:\"1303\", FieldPath:\"\"}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set fibonacci-1503002127 to 1\n","stream":"stderr","time":"2017-05-14T17:47:08.63177467Z"}
{"log":"I0514 17:47:08.650662 1 event.go:217] Event(v1.ObjectReference{Kind:\"ReplicaSet\", Namespace:\"default\", Name:\"fibonacci-1503002127\", UID:\"59ea64bb-38cd-11e7-b345-fa163edb1ca6\", APIVersion:\"extensions\", ResourceVersion:\"1304\", FieldPath:\"\"}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: fibonacci-1503002127-3k755\n","stream":"stderr","time":"2017-05-14T17:47:08.650826398Z"}
{"log":"E0514 17:49:00.873703 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:49:00.874034952Z"}
{"log":"E0514 17:49:30.884078 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:49:30.884546461Z"}
{"log":"E0514 17:50:00.896563 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:50:00.89688734Z"}
{"log":"E0514 17:50:30.906293 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:50:30.906825794Z"}
{"log":"E0514 17:51:00.915996 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:51:00.916348218Z"}
{"log":"E0514 17:51:30.926043 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:51:30.926367623Z"}
{"log":"E0514 17:52:00.936574 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:52:00.936903072Z"}
{"log":"E0514 17:52:30.944724 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:52:30.945120508Z"}
{"log":"E0514 17:53:00.954785 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:53:00.955126309Z"}
{"log":"E0514 17:53:30.970454 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:53:30.972996568Z"}
{"log":"E0514 17:54:00.980735 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:54:00.981098832Z"}
{"log":"E0514 17:54:30.993176 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:54:30.993538841Z"}
{"log":"E0514 17:55:01.002941 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:55:01.003265908Z"}
{"log":"W0514 17:55:06.511756 1 reflector.go:323] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:192: watch of \u003cnil\u003e ended with: etcdserver: mvcc: required revision has been compacted\n","stream":"stderr","time":"2017-05-14T17:55:06.511957851Z"}
{"log":"E0514 17:55:31.013415 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:55:31.013776243Z"}
{"log":"E0514 17:56:01.024507 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:56:01.0248332Z"}
{"log":"E0514 17:56:31.036191 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:56:31.036606698Z"}
{"log":"E0514 17:57:01.049277 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:57:01.049616359Z"}
{"log":"E0514 17:57:31.064104 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:57:31.064489485Z"}
{"log":"E0514 17:58:01.073988 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:58:01.074339488Z"}
{"log":"E0514 17:58:31.084511 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:58:31.084839352Z"}
{"log":"E0514 17:59:01.096507 1 horizontal.go:201] failed to compute desired number of replicas based on listed metrics for Deployment/default/fibonacci: failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services http:heapster:)\n","stream":"stderr","time":"2017-05-14T17:59:01.096896254Z"}
您可以从部署中删除LIMITS并尝试它。在我的部署中,我只使用REQUESTS for RESOURCES而且它有效。如果您看到Horizontal Pod Autoscaler(HPA)正在运行,那么稍后您也可以使用LIMITS。 This discussion告诉您,仅使用REQUESTS就足以完成HPA。
我在其他应用程序中也看到了这一点:HPA API中似乎存在一个错误。
解决方案可以是使用复制控制器scaleref:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: fibonacci
namespace: ....
spec:
scaleRef:
kind: ReplicationController
name: fibonacci
subresource: scale
minReplicas: 1
maxReplicas: 3
targetCPUUtilizationPercentage: 50
未经测试,所以可能需要对scaleRef
进行一些编辑(你使用的是scaleTargetRef
)
有一个选项可以在群集池上启用自动扩展,请确保首先打开它。
然后应用你的hpa,不要忘记设置cpu,内存请求,k8s控制器的限制
需要注意的一件事是,如果您的pod上有多个容器,则应该为每个容器指定cpu,内存请求和限制
如果您使用GKE 1.9.x
有一些错误,需要首先禁用自动缩放然后重新启用它。这将提供当前值以代替未知
尝试更新到最新的GKE。