RabbitMQ Cluster Kubernetes Operator on minikube

Question (votes: 0, answers: 5)

I am trying to set up RabbitMQ on Minikube using the RabbitMQ Cluster Operator.

When I try to attach a persistent volume, I get the following error:

$ kubectl logs -f rabbitmq-rabbitmq-server-0

Configuring logger redirection
20:04:40.081 [warning] Failed to write PID file "/var/lib/rabbitmq/mnesia/rabbit@rabbitmq-rabbitmq-server-0.rabbitmq-rabbitmq-headless.default.pid": permission denied
20:04:40.264 [error] Failed to create Ra data directory at '/var/lib/rabbitmq/mnesia/rabbit@rabbitmq-rabbitmq-server-0.rabbitmq-rabbitmq-headless.default/quorum/rabbit@rabbitmq-rabbitmq-server-0.rabbitmq-rabbitmq-headless.default', file system operation error: enoent
20:04:40.265 [error] Supervisor ra_sup had child ra_system_sup started with ra_system_sup:start_link() at undefined exit with reason {error,"Ra could not create its data directory. See the log for details."} in context start_error
20:04:40.266 [error] CRASH REPORT Process <0.247.0> with 0 neighbours exited with reason: {error,"Ra could not create its data directory. See the log for details."} in ra_system_sup:init/1 line 43
20:04:40.267 [error] CRASH REPORT Process <0.241.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,ra_system_sup,{error,"Ra could not create its data directory. See the log for details."}}},{ra_app,start,[normal,[]]}} in application_master:init/4 line 138
{"Kernel pid terminated",application_controller,"{application_start_failure,ra,{{shutdown,{failed_to_start_child,ra_system_sup,{error,\"Ra could not create its data directory. See the log for details.\"}}},{ra_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,ra,{{shutdown,{failed_to_start_child,ra_system_sup,{error,"Ra could not create its data directory. See the log for details."}

Crash dump is being written to: erl_crash.dump...

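The problem is that RabbitMQ lacks the permissions to set up its data files in the data directory

/var/lib/rabbitmq/mnesia

My initial guess was that I need to specify the data directory as a volumeMount, but according to the documentation that does not seem to be configurable.

RabbitMQ's troubleshooting documentation on persistence leads to a 404.

I looked online for other people with the same problem, but none of them were using the RabbitMQ Cluster Operator. If I can't find a solution I plan to follow that route (deploying RabbitMQ without the operator), but I would really like to solve this issue somehow.

Does anyone have any ideas?

My setup is as follows: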

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq
spec:
  replicas: 1
  service:
    type: NodePort
  persistence:
    storageClassName: local-storage
    storage: 20Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rabbitmq-pvc
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-pv
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: /mnt/app/rabbitmq
    type: DirectoryOrCreate

To reproduce this issue on minikube:

  1. Install the RabbitMQ operator:

    kubectl apply -f "https://github.com/rabbitmq/cluster-operator/releases/latest/download/cluster-operator.yml"

  2. Apply the manifest file above:

    kubectl apply -f rabbitmq.yml

  3. Run

    kubectl get po

    which shows a Pod named rabbitmq-rabbitmq-server-0.

  4. Run

    kubectl logs -f rabbitmq-rabbitmq-server-0

    to view the logs, which show the error above.

kubernetes rabbitmq
5 answers

4 votes

As I suggested in the comments, you can fix it by running:

minikube ssh -- sudo chmod g+w /mnt/app/rabbitmq/ 

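To answer your question:

"Is there a way I can add this to my manifest file instead of having to do it manually?"

You can override the rabbitmq StatefulSet manifest fields to change the last line of the initContainer command script from

chgrp 999 /var/lib/rabbitmq/mnesia/

to:

chown 999:999 /var/lib/rabbitmq/mnesia/

The complete RabbitmqCluster YAML manifest looks like this: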

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq
spec:
  replicas: 1
  service:
    type: NodePort
  persistence:
    storageClassName: local-storage
    storage: 20Gi
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []
            initContainers:
            - name: setup-container
              command:
              - sh
              - -c
              - cp /tmp/rabbitmq/rabbitmq.conf /etc/rabbitmq/rabbitmq.conf && chown 999:999
                /etc/rabbitmq/rabbitmq.conf && echo '' >> /etc/rabbitmq/rabbitmq.conf ; cp /tmp/rabbitmq/advanced.config
                /etc/rabbitmq/advanced.config && chown 999:999 /etc/rabbitmq/advanced.config
                ; cp /tmp/rabbitmq/rabbitmq-env.conf /etc/rabbitmq/rabbitmq-env.conf && chown
                999:999 /etc/rabbitmq/rabbitmq-env.conf ; cp /tmp/erlang-cookie-secret/.erlang.cookie
                /var/lib/rabbitmq/.erlang.cookie && chown 999:999 /var/lib/rabbitmq/.erlang.cookie
                && chmod 600 /var/lib/rabbitmq/.erlang.cookie ; cp /tmp/rabbitmq-plugins/enabled_plugins
                /etc/rabbitmq/enabled_plugins && chown 999:999 /etc/rabbitmq/enabled_plugins
                ; chown 999:999 /var/lib/rabbitmq/mnesia/ # <- CHANGED THIS LINE

0 votes

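I had the same problem deploying RabbitMQ in Kubernetes inside Vagrant (not minikube, though). I was using this setup. I tried running

sudo chmod g+w /mnt/app/rabbitmq/

inside my node, but no luck. I eventually gave up and ended up using this box to run minikube inside Vagrant, and everything worked great out of the box. I did not need to do anything special, not even create the persistent volume manually.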
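That experience fits a setup that relies on minikube's dynamic provisioning instead of a hand-made PersistentVolume. Below is a minimal sketch, assuming the cluster has a default StorageClass (minikube ships one named standard, backed by its hostpath provisioner). The storageClassName field and the manual PV/PVC are left out so that the operator's StatefulSet creates its own claim. This illustrates the idea and is not the exact setup used in this answer. Another answer on this page reports still needing a chmod on the hostpath provisioner directory.

```yaml
# Sketch only: depends on the cluster's default StorageClass being usable.
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq
spec:
  replicas: 1
  service:
    type: NodePort
  persistence:
    # storageClassName is omitted on purpose, so the default StorageClass is used
    storage: 20Gi
```

With this variant no separate PersistentVolume or PersistentVolumeClaim manifest is applied; the claim for the Pod is created from the StatefulSet's volume claim template.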


0 votes

I ran into this issue in a live version, where minikube did not allow running SSH commands. So I ran chmod on my hostpath provisioner directory and recreated my rabbitmq-cluster:

chmod 777 /tmp/hostpath-provisioner/default/*

0 votes

I found the answer to this problem. It happens when the cluster has few nodes. The solution is to add

securityContext: {}

https://github.com/rabbitmq/rabbitmq-website/blob/3ee8e72a7c4fe52e323ba1039eecbf3a67c554f7/site/kubernetes/operator/using-on-openshift.md#arbitrary-user-ids
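The snippet above does not say where the field belongs. Here is a sketch of one likely placement, assuming it follows the pattern in the linked OpenShift guide: an override of the StatefulSet pod template that resets the pod-level security context to empty. The placement is an assumption to check against that guide, not a confirmed fix for minikube.

```yaml
# Sketch only: the placement of securityContext is assumed from the linked guide.
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq
spec:
  replicas: 1
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []       # kept empty, as in the override shown in the top answer
            securityContext: {}  # assumed: pod-level security context reset to empty
```

With an empty security context the Pod no longer runs with the user and group IDs the operator would otherwise set, so whether this helps depends on which user the container ends up running as and on who owns the volume directory.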


0 votes

@Koe Kaverna's answer is correct. It also worked for my K8s cluster.
