K8s version: 1.27.3 | Resources: 12 CPUs, 16 GB RAM on the 2 masters | HAProxy version: 2.0 | Shim: Cri-docker | Docker: 20.11
HAPROXY CONFIG:
frontend kubernetes-frontend
    bind 192.168.159.105:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server k8controller 192.168.159.53:6443 check fall 3 rise 2
    server k8controller2 192.168.159.106:6443 check fall 3 rise 2
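
For context on what the `check` keyword in this config tests: in `mode tcp` with `option tcp-check` and no `tcp-check` rules defined, my understanding is that the health check only verifies that a TCP connection to port 6443 can be opened. It does not ask the API server whether it is actually healthy. Below is a minimal Python sketch of that kind of probe. The function name `tcp_check` and the 2-second timeout are my own assumptions for illustration, not something HAProxy exposes.

```python
import socket

# Backend addresses copied from the HAProxy config above.
BACKENDS = {
    "k8controller": ("192.168.159.53", 6443),
    "k8controller2": ("192.168.159.106", 6443),
}


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    This only proves that something is listening on the port; it says
    nothing about whether kube-apiserver can reach a healthy etcd.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, (host, port) in BACKENDS.items():
        state = "UP" if tcp_check(host, port) else "DOWN"
        print(f"{name} {host}:{port} {state}")
```

If that reading is right, a master whose kube-apiserver is still listening but answering with 500 errors would still be marked UP, and HAProxy would keep routing requests to it.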
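I have a multi-node Kubernetes cluster set up behind HAProxy, and the workers and masters are already up and running. The issue is the following:
When one of the masters goes down, the other master is supposed to take over so that I can still view and manage Pods and everything else, but instead I get this error: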
I0802 06:23:25.503865 363393 cached_discovery.go:77] returning cached discovery info from /home/mw/.kube/cache/discovery/192.168.159.105_6443/v1/serverresources.json
I0802 06:23:25.506496 363393 round_trippers.go:466] curl -v -XGET -H "Accept: application/json;as=Table;v=v1;g=meta.k8s.io,application/json;as=Table;v=v1beta1;g=meta.k8s.io,application/json" -H "User-Agent: kubectl/v1.27.3 (linux/amd64) kubernetes/25b4e43" 'https://192.168.159.105:6443/api/v1/namespaces/default/pods?limit=500'
I0802 06:23:25.508591 363393 round_trippers.go:510] HTTP Trace: Dial to tcp:192.168.159.105:6443 succeed
I0802 06:23:38.766834 363393 round_trippers.go:553] GET https://192.168.159.105:6443/api/v1/namespaces/default/pods?limit=500 500 Internal Server Error in 13260 milliseconds
I0802 06:23:38.766915 363393 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 1 ms TLSHandshake 24 ms ServerProcessing 13232 ms Duration 13260 ms
I0802 06:23:38.766936 363393 round_trippers.go:577] Response Headers:
I0802 06:23:38.766959 363393 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid:
I0802 06:23:38.766978 363393 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid:
I0802 06:23:38.766997 363393 round_trippers.go:580] Content-Length: 122
I0802 06:23:38.767026 363393 round_trippers.go:580] Date: Wed, 02 Aug 2023 06:23:38 GMT
I0802 06:23:38.767046 363393 round_trippers.go:580] Audit-Id: cce345e8-2efc-42fa-a145-6ff5bbfe6a06
I0802 06:23:38.767080 363393 round_trippers.go:580] Cache-Control: no-cache, private
I0802 06:23:38.767109 363393 round_trippers.go:580] Content-Type: application/json
I0802 06:23:38.767274 363393 request.go:1188] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"etcdserver: request timed out","code":500}
I0802 06:23:38.768138 363393 helpers.go:246] server response object: [{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "etcdserver: request timed out",
"code": 500
}]
Error from server: etcdserver: request timed out
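
The response body in this trace is a Kubernetes `Status` object, and its message originates from etcd and is relayed by kube-apiserver. HAProxy in `mode tcp` only forwards the TLS stream, so it cannot be the source of the HTTP 500. Here is a small sketch that pulls the relevant fields out of that body; the helper name `is_etcd_timeout` is hypothetical.

```python
import json

# Response body copied verbatim from the kubectl trace above.
BODY = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},'
    '"status":"Failure","message":"etcdserver: request timed out","code":500}'
)


def is_etcd_timeout(body):
    """Return True if the body is a Failure Status carrying an etcd timeout."""
    status = json.loads(body)
    return (
        status.get("kind") == "Status"
        and status.get("status") == "Failure"
        and "etcdserver: request timed out" in status.get("message", "")
    )


if __name__ == "__main__":
    print(is_etcd_timeout(BODY))
```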
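Until I power the other master back on, I never get any information about configs, pods, or anything else. Once it is back, both masters can view and manage everything on the cluster (no matter which one I use).
Supposedly, when some masters go down the others can take control and you can keep managing the cluster, but I don't understand where the problem is that prevents this from happening here.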
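
One thing that may be relevant here, assuming the cluster uses stacked etcd with one etcd member per master (this post does not confirm that): etcd needs a majority of its members to be up before it will serve requests. The arithmetic, sketched in Python with hypothetical helper names:

```python
def quorum(members):
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def failures_tolerated(members):
    """How many members can be down while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"members={n} quorum={quorum(n)} tolerated={failures_tolerated(n)}")
```

With 2 members the majority is 2, so no failure is tolerated; with 3 members the majority is still 2, so one failure is tolerated. Under that assumption, `etcdserver: request timed out` after losing one of two masters would be consistent with a loss of quorum rather than with a load balancer problem.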
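Extra: when one server goes down, two processes on the other server (etcdserver and kube-apiserver) each climb to 500% CPU usage.
Every worker, every master, and the HAProxy host can resolve the others' IPs through DNS and has connectivity to them. I don't know exactly what to try next.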