K3s node keeps failing and I'm not sure what's causing it


I've been running k3s on a bunch of Raspberry Pi 4s and 3s for a while now, but one node keeps failing. Here is my setup:

Servers: one Raspberry Pi 4 (8 GB) and two Raspberry Pi 4 (4 GB). Workers: three Raspberry Pi 3B.

My second Raspberry Pi 4 (4 GB) is the one that keeps failing. All 6 nodes are connected to gigabit Ethernet and each has a Samsung 500 GB SSD.

I'm running an HA setup with an external SQL database hosted on my NAS (192.168.1.200), as shown below.

Here are some of the errors I'm getting.

When I run

sudo systemctl status k3s
I get (user/password changed for privacy reasons):

    k3s.service - Lightweight Kubernetes
   Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Sun 2021-07-04 01:59:29 BST; 609ms ago
     Docs: https://k3s.io
  Process: 14637 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
  Process: 14639 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
  Process: 14640 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
  Process: 14641 ExecStart=/usr/local/bin/k3s server --tls-san 192.168.1.100 --datastore-endpoint mysql://user:pass@tcp(192.168.1.200:3306)/k3s --disable traefik (code=exited, status=1/FAILURE)
 Main PID: 14641 (code=exited, status=1/FAILURE)

When I run

journalctl -xe
I get:

-- Subject: A start job for unit k3s.service has begun execution
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- 
-- A start job for unit k3s.service has begun execution.
-- 
-- The job identifier is 325190.
Jul 04 02:13:46 node50 sh[18386]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Jul 04 02:13:46 node50 sh[18386]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Jul 04 02:13:47 node50 k3s[18390]: time="2021-07-04T02:13:47.365627432+01:00" level=info msg="Starting k3s v1.19.12+k3s1 (559d0c47)"
Jul 04 02:13:47 node50 k3s[18390]: time="2021-07-04T02:13:47.366403382+01:00" level=info msg="Cluster bootstrap already complete"
Jul 04 02:13:47 node50 k3s[18390]: time="2021-07-04T02:13:47.418216205+01:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: Error 1129: Host is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'"
Jul 04 02:13:47 node50 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- 
-- An ExecStart= process belonging to unit k3s.service has exited.
-- 
-- The process' exit code is 'exited' and its exit status is 1.
Jul 04 02:13:47 node50 systemd[1]: k3s.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- 
-- The unit k3s.service has entered the 'failed' state with result 'exit-code'.
Jul 04 02:13:47 node50 systemd[1]: Failed to start Lightweight Kubernetes.
-- Subject: A start job for unit k3s.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- 
-- A start job for unit k3s.service has finished with a failure.
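To see how often the block recurs, the timestamps of the fatal lines can be pulled out of the journal. This is a minimal sketch, assuming the default `journalctl` short output format seen above (month, day, time as the first three fields); the function name `host_blocked_times` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and print the timestamp
# (month, day, time) of every fatal k3s line that mentions MySQL Error 1129.
host_blocked_times() {
    awk '/level=fatal/ && /Error 1129/ { print $1, $2, $3 }'
}

# Example usage on the failing node (not run here):
#   journalctl -u k3s --no-pager | host_blocked_times
```

Comparing these timestamps with the times `mysqladmin flush-hosts` was run shows how long the database takes to block the host again.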

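Based on this error, I logged in to the MySQL instance and ran

mysqladmin flush-hosts
which fixed the problem for a few hours, and then it happened again. I don't understand why this only happens on one Pi. Otherwise the Pi works fine: it runs Docker and other programs without any problems.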

I also increased the max connections setting in my.cnf from 100 to 10000.
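One thing worth checking: in MySQL, Error 1129 is tied to the `max_connect_errors` variable (how many interrupted connection attempts a host may make before it is blocked), which is separate from `max_connections`. The sketch below is only an illustration. The function names are made up, the `mysql` client is assumed to be installed, and the `performance_schema.host_cache` query assumes the performance schema is enabled on the server, which may not be the case on a NAS.

```shell
#!/bin/sh
# Hypothetical helper: extract "host port" from a k3s MySQL datastore endpoint
# of the form mysql://user:pass@tcp(HOST:PORT)/dbname.
datastore_host_port() {
    printf '%s\n' "$1" | sed -n 's/.*@tcp(\([^:)]*\):\([0-9]*\)).*/\1 \2/p'
}

# Hypothetical helper: show the error threshold and the per-host error counts.
# Arguments: host, port, MySQL user (the password is prompted for by -p).
show_connect_errors() {
    mysql -h "$1" -P "$2" -u "$3" -p -e "
        SHOW GLOBAL VARIABLES LIKE 'max_connect_errors';
        SELECT IP, HOST, SUM_CONNECT_ERRORS FROM performance_schema.host_cache;"
}

# Example usage (not run here):
#   set -- $(datastore_host_port 'mysql://user:pass@tcp(192.168.1.200:3306)/k3s')
#   show_connect_errors "$1" "$2" user
```

If the failing node's IP shows a high error count while the other servers stay near zero, the problem is more likely on the path between that Pi and the NAS than in k3s itself.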

Does anyone have any ideas?

mysql kubernetes networking raspberry-pi4 k3s
1 Answer
Just try reinstalling k3s at the version you want, for example

curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.27.7+k3s2 sh -
for version v1.27.7+k3s2.
