I'm having trouble with a KEDA ScaledObject that uses a Prometheus trigger. For some reason the ScaledObject cannot see the Prometheus metrics exposed by my .NET application.
I'm not sure whether the Prometheus server details in the ScaledObject are incorrect, or whether something more subtle is causing my problem.
Using Azure Kubernetes Service, I created a 3-node cluster:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-nodepool1-42472506-vmss00000l Ready agent 39m v1.24.9 10.224.0.4 <none> Ubuntu 18.04.6 LTS 5.4.0-1104-azure containerd://1.6.18+azure-1
aks-nodepool1-42472506-vmss00000m Ready agent 39m v1.24.9 10.224.0.6 <none> Ubuntu 18.04.6 LTS 5.4.0-1104-azure containerd://1.6.18+azure-1
aks-nodepool1-42472506-vmss00000n Ready agent 39m v1.24.9 10.224.0.5 <none> Ubuntu 18.04.6 LTS 5.4.0-1104-azure containerd://1.6.18+azure-1
The deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: servant-deploy
  labels:
    app: servant-app
spec:
  template:
    metadata:
      labels:
        app: servant-app
    spec:
      volumes:
        - name: servant-app-volume
          configMap:
            name: servant-app
      containers:
        - name: servant-app
          image: scalingacr.azurecr.io/statelessservantservice_image
          volumeMounts:
            - name: servant-app-volume
              mountPath: /app/appsettings.k8s.json
              subPath: appsettings.k8s.json
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
            requests:
              cpu: 200m
      imagePullSecrets:
        - name: acr-secret
  selector:
    matchLabels:
      app: servant-app
The associated service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: servant-service
  labels:
    app: servant-app
spec:
  type: LoadBalancer
  selector:
    app: servant-app
  ports:
    - protocol: TCP
      name: web
      port: 80
      targetPort: 80
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
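To rule out a selector mismatch between the Service and the pods, I can confirm that the Service actually has endpoints backing it (assuming everything is in the default namespace):

```shell
# Confirm the pods carry the label the Service selects on
kubectl get pods -l app=servant-app -o wide

# Confirm the Service has endpoints; an empty list would mean the
# selector does not match any ready pods
kubectl get endpoints servant-service
```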
Installed the Prometheus Operator as follows:
git clone https://github.com/prometheus-operator/kube-prometheus.git
# Navigate to the kube-prometheus folder and run the commands (as detailed in https://github.com/prometheus-operator/kube-prometheus/tree/release-0.12):
kubectl apply --server-side -f manifests/setup
kubectl wait \
--for condition=Established \
--all CustomResourceDefinition \
--namespace=monitoring
kubectl apply -f manifests/
Created a Prometheus instance and an associated ServiceMonitor to allow Prometheus to scrape metrics from my .NET application:
prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.32.1
  name: applications
  namespace: monitoring
spec:
  image: quay.io/prometheus/prometheus:v2.32.1
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 1
  resources:
    requests:
      memory: 400Mi
  ruleSelector: {}
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {} # match all namespaces
  #serviceMonitorNamespaceSelector:
  #  matchLabels:
  #    kubernetes.io/metadata.name: default
  serviceMonitorSelector: {} # match all servicemonitors
  version: v2.32.1
servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: servant-app
  namespace: default
  labels:
    name: servant-app
spec:
  endpoints:
    - interval: 30s
      port: web
  selector:
    matchLabels:
      app: servant-app
kubectl apply -f prometheus.yaml
kubectl apply -f servicemonitor.yaml
kubectl -n monitoring get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-main ClusterIP 10.0.29.217 <none> 9093/TCP,8080/TCP 34m
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 34m
blackbox-exporter ClusterIP 10.0.161.43 <none> 9115/TCP,19115/TCP 34m
grafana ClusterIP 10.0.66.180 <none> 3000/TCP 34m
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 34m
node-exporter ClusterIP None <none> 9100/TCP 34m
prometheus-adapter ClusterIP 10.0.120.223 <none> 443/TCP 34m
prometheus-k8s ClusterIP 10.0.167.102 <none> 9090/TCP,8080/TCP 34m
prometheus-operated ClusterIP None <none> 9090/TCP 34m
prometheus-operator ClusterIP None <none> 8443/TCP 34m
The Prometheus UI can be accessed with:
kubectl -n monitoring port-forward service/prometheus-operated 9090:9090
In the Prometheus UI I can see my .NET application metric hs_ready_readiness.
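The same check can also be done against the Prometheus HTTP API through that port-forward, which takes the UI out of the picture (assuming the forward to local port 9090 is still running):

```shell
# Query the metric directly; a non-empty "result" array in the JSON
# response confirms Prometheus has scraped it
curl -s 'http://localhost:9090/api/v1/query?query=hs_ready_readiness'
```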
Installed KEDA as follows:
# Install helm 3
curl -fsSL -o get_helm.sh <address of helm 3>
chmod 700 get_helm.sh
./get_helm.sh
# Install Keda
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda --set metricsServer.useHostNetwork=true --set prometheus.metricServer.enabled=true
kubectl -n keda get all
NAME READY STATUS RESTARTS AGE
pod/keda-admission-webhooks-6cd9cdbff8-mjr8r 1/1 Running 0 91m
pod/keda-operator-7d5994cbf9-bpqpv 1/1 Running 0 91m
pod/keda-operator-metrics-apiserver-67879677bd-xhwp4 1/1 Running 0 91m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/keda-admission-webhooks ClusterIP 10.0.226.117 <none> 443/TCP 3h46m
service/keda-operator ClusterIP 10.0.228.219 <none> 9666/TCP 3h46m
service/keda-operator-metrics-apiserver ClusterIP 10.0.87.217 <none> 443/TCP,80/TCP,9022/TCP 3h46m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/keda-admission-webhooks 1/1 1 1 3h46m
deployment.apps/keda-operator 1/1 1 1 3h46m
deployment.apps/keda-operator-metrics-apiserver 1/1 1 1 3h46m
NAME DESIRED CURRENT READY AGE
replicaset.apps/keda-admission-webhooks-6cd9cdbff8 1 1 1 3h46m
replicaset.apps/keda-operator-7d5994cbf9 1 1 1 3h46m
replicaset.apps/keda-operator-metrics-apiserver-67879677bd 1 1 1 3h46m
With KEDA running, I applied the following ScaledObject to pick up my metric hs_ready_readiness:
hs-keda3.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: servant-deploy
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-operated.monitoring.svc.cluster.local:80
        metricName: hs_ready_readiness
        threshold: '1'
        query: hs_ready_readiness
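After applying it, the ScaledObject's own status is worth inspecting, since KEDA surfaces trigger connectivity problems there:

```shell
kubectl apply -f hs-keda3.yaml

# READY=False in the status columns usually points at the trigger
# configuration, e.g. an unreachable serverAddress
kubectl get scaledobject prometheus-scaledobject
kubectl describe scaledobject prometheus-scaledobject
```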
kubectl get all
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/keda-hpa-prometheus-scaledobject Deployment/servant-deploy <unknown>/1 (avg) 1 100 1 3m23s
kubectl describe horizontalpodautoscaler.autoscaling/keda-hpa-prometheus-scaledobject
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetExternalMetric the HPA was unable to compute the replica count: unable to get external metric default/s0-prometheus-hs_ready_readiness/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: prometheus-scaledobject,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: Get "https://hcp-kubernetes.6416bc6ff2dbef0001660539.svc.cluster.local:443/apis/external.metrics.k8s.io/v1beta1/namespaces/default/s0-prometheus-hs_ready_readiness?labelSelector=scaledobject.keda.sh%!F(MISSING)name%!D(MISSING)prometheus-scaledobject": stream error: stream ID 29285; INTERNAL_ERROR; received from peer
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetExternalMetric 76s horizontal-pod-autoscaler unable to get external metric default/s0-prometheus-hs_ready_readiness/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: prometheus-scaledobject,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: Get "https://hcp-kubernetes.6416bc6ff2dbef0001660539.svc.cluster.local:443/apis/external.metrics.k8s.io/v1beta1/namespaces/default/s0-prometheus-hs_ready_readiness?labelSelector=scaledobject.keda.sh%2Fname%3Dprometheus-scaledobject": stream error: stream ID 28963; INTERNAL_ERROR; received from peer
Warning FailedComputeMetricsReplicas 76s horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get s0-prometheus-hs_ready_readiness external metric: unable to get external metric default/s0-prometheus-hs_ready_readiness/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: prometheus-scaledobject,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: Get "https://hcp-kubernetes.6416bc6ff2dbef0001660539.svc.cluster.local:443/apis/external.metrics.k8s.io/v1beta1/namespaces/default/s0-prometheus-hs_ready_readiness?labelSelector=scaledobject.keda.sh%2Fname%3Dprometheus-scaledobject": stream error: stream ID 28963; INTERNAL_ERROR; received from peer
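To narrow down whether the failure sits in the KEDA metrics adapter or in the Prometheus trigger itself, the external metrics API can be queried directly (using the same path the HPA error message shows) and the operator logs checked:

```shell
# Ask the external metrics API for the metric the HPA is requesting
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/s0-prometheus-hs_ready_readiness?labelSelector=scaledobject.keda.sh%2Fname%3Dprometheus-scaledobject"

# Check the KEDA operator logs for scaler errors, e.g. connection
# refused against the configured serverAddress
kubectl -n keda logs deployment/keda-operator | grep -i error
```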
kubectl get apiservices
NAME SERVICE AVAILABLE AGE
v1. Local True 3d3h
v1.admissionregistration.k8s.io Local True 3d3h
v1.apiextensions.k8s.io Local True 3d3h
v1.apps Local True 3d3h
v1.authentication.k8s.io Local True 3d3h
v1.authorization.k8s.io Local True 3d3h
v1.autoscaling Local True 3d3h
v1.batch Local True 3d3h
v1.certificates.k8s.io Local True 3d3h
v1.coordination.k8s.io Local True 3d3h
v1.discovery.k8s.io Local True 3d3h
v1.events.k8s.io Local True 3d3h
v1.monitoring.coreos.com Local True 36m
v1.networking.k8s.io Local True 3d3h
v1.node.k8s.io Local True 3d3h
v1.policy Local True 3d3h
v1.rbac.authorization.k8s.io Local True 3d3h
v1.scheduling.k8s.io Local True 3d3h
v1.snapshot.storage.k8s.io Local True 3h56m
v1.storage.k8s.io Local True 3d3h
v1alpha1.keda.sh Local True 94m
v1alpha1.monitoring.coreos.com Local True 36m
v1beta1.batch Local True 3d3h
v1beta1.discovery.k8s.io Local True 3d3h
v1beta1.events.k8s.io Local True 3d3h
v1beta1.external.metrics.k8s.io keda/keda-operator-metrics-apiserver True 3h48m
v1beta1.flowcontrol.apiserver.k8s.io Local True 3d3h
v1beta1.metrics.k8s.io kube-system/metrics-server True 40m
v1beta1.node.k8s.io Local True 3d3h
v1beta1.policy Local True 3d3h
v1beta1.snapshot.storage.k8s.io Local True 3h56m
v1beta1.storage.k8s.io Local True 3d3h
v1beta2.flowcontrol.apiserver.k8s.io Local True 3d3h
v2.autoscaling Local True 3d3h
v2beta1.autoscaling Local True 3d3h
v2beta2.autoscaling Local True 3d3h
Any ideas as to why my KEDA ScaledObject cannot see my .NET application metrics would be much appreciated.