
Why can't my RabbitMQ cluster on K8s (multi-node Minikube) create its mnesia directory?

I am attempting to create a (currently single-node) RabbitMQ cluster inside a local Minikube instance. However, there seems to be a permission issue when creating the RMQ cluster on a Minikube instance that has two nodes.

Prerequisites:

  1. Have Minikube, kubectl, and krew installed.

Steps to reproduce:

  1. Start Minikube: minikube start --memory 8192 --cpus 4 --nodes 2
😄  minikube v1.27.1 on Debian bookworm/sid
✨  Automatically selected the docker driver
📌  Using Docker driver with root privileges
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🔥  Creating docker container (CPUs=4, Memory=8192MB) ...
🐳  Preparing Kubernetes v1.25.2 on Docker 20.10.18 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass

👍  Starting worker node minikube-m02 in cluster minikube
🚜  Pulling base image ...
🔥  Creating docker container (CPUs=4, Memory=8192MB) ...
🌐  Found network options:
    ▪ NO_PROXY=192.168.49.2
🐳  Preparing Kubernetes v1.25.2 on Docker 20.10.18 ...
    ▪ env NO_PROXY=192.168.49.2
🔎  Verifying Kubernetes components...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
  2. Install the RabbitMQ plugin for kubectl: kubectl krew install rabbitmq
  3. Install the RabbitMQ cluster operator into Minikube: kubectl rabbitmq install-cluster-operator
  4. Create a default, single-node cluster with the operator: kubectl rabbitmq create default

This results in the Persistent Volume (PV), Persistent Volume Claim (PVC), StatefulSet, and Service being created. After waiting a little while for the PVC to be bound to the PV, the Pod gets created. However, the Pod then enters a crash loop with the following console errors:

2023-01-19 16:31:42.993395+00:00 [warning] <0.130.0> Failed to write PID file "/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default.pid": permission denied
2023-01-19 16:31:43.520923+00:00 [info] <0.221.0> Feature flags: list of feature flags found:
2023-01-19 16:31:43.520982+00:00 [info] <0.221.0> Feature flags:   [ ] classic_mirrored_queue_version
2023-01-19 16:31:43.521013+00:00 [info] <0.221.0> Feature flags:   [ ] implicit_default_bindings
2023-01-19 16:31:43.521060+00:00 [info] <0.221.0> Feature flags:   [ ] maintenance_mode_status
2023-01-19 16:31:43.521087+00:00 [info] <0.221.0> Feature flags:   [ ] quorum_queue
2023-01-19 16:31:43.521118+00:00 [info] <0.221.0> Feature flags:   [ ] stream_queue
2023-01-19 16:31:43.521147+00:00 [info] <0.221.0> Feature flags:   [ ] user_limits
2023-01-19 16:31:43.521186+00:00 [info] <0.221.0> Feature flags:   [ ] virtual_host_metadata
2023-01-19 16:31:43.521204+00:00 [info] <0.221.0> Feature flags: feature flag states written to disk: yes
2023-01-19 16:31:43.688848+00:00 [notice] <0.44.0> Application syslog exited with reason: stopped
2023-01-19 16:31:43.689010+00:00 [notice] <0.221.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2023-01-19 16:31:43.697297+00:00 [notice] <0.221.0> Logging: configured log handlers are now ACTIVE
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0> 
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0> BOOT FAILED
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0> ===========
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0> Error during startup: {error,
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0>                           {cannot_create_mnesia_dir,
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0>                               "/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default/",
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0>                               eacces}}
2023-01-19 16:31:43.715974+00:00 [error] <0.221.0> 
BOOT FAILED
===========
Error during startup: {error,
                          {cannot_create_mnesia_dir,
                              "/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default/",
                              eacces}}
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>   crasher:
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     initial call: application_master:init/4
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     pid: <0.220.0>
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     registered_name: []
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     exception exit: {{cannot_create_mnesia_dir,
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>                          "/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default/",
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>                          eacces},
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>                      {rabbit,start,[normal,[]]}}
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>       in function  application_master:init/4 (application_master.erl, line 142)
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     ancestors: [<0.219.0>]
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     message_queue_len: 1
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     messages: [{'EXIT',<0.221.0>,normal}]
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     links: [<0.219.0>,<0.44.0>]
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     dictionary: []
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     trap_exit: true
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     status: running
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     heap_size: 987
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     stack_size: 29
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>     reductions: 158
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0>   neighbours:
2023-01-19 16:31:44.716751+00:00 [error] <0.220.0> 
2023-01-19 16:31:44.720886+00:00 [notice] <0.44.0> Application rabbit exited with reason: {{cannot_create_mnesia_dir,"/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default/",eacces},{rabbit,start,[normal,[]]}}
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{cannot_create_mnesia_dir,\"/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default/\",eacces},{rabbit,start,[normal,[]]}}}"} 
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{cannot_create_mnesia_dir,"/var/lib/rabbitmq/mnesia/rabbit@default-server-0.default-nodes.default/",eacces},{rabbit,start,[normal,[]]}}}) 
 
Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done 
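A few commands that were handy for poking at this, in case it helps anyone else. These are just standard kubectl calls; the Pod name default-server-0 is taken from the node name in the log above, so adjust it if your cluster is named differently:

# confirm the PVC actually bound and see which Minikube node the Pod landed on
kubectl get pvc,pv
kubectl get pod default-server-0 -o wide

# inspect ownership of the mounted data directory (only possible while the container is running)
kubectl exec -it default-server-0 -- ls -ld /var/lib/rabbitmq/mnesia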

However, if I delete my Minikube cluster (minikube stop && minikube delete), re-create it with a single node (minikube start --memory 8192 --cpus 4 --nodes 1), and follow the aforementioned steps to create the default RabbitMQ cluster, then there are no problems. I don't understand how adding a second node to Minikube could cause this issue.

I feel like I'm just missing something obvious, but I'm not sure what it is.

Any sort of suggestions or feedback would be greatly appreciated. Please let me know if there are more details I should provide. Thank you in advance!

In typical "me" fashion, I finally found the correct phrase to search for and get results. It turns out that the storage provisioner bundled with Minikube doesn't really work with 2+ nodes: as far as I can tell, its hostPath volumes only live on the primary node, so a Pod scheduled onto the second node ends up with a data directory it cannot write to, hence the eacces. Replacing the provisioner with another one (kubevirt's hostpath provisioner), as explained in a GitHub issue comment, allows the Pod to spin up correctly.
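One way to see the mismatch for yourself (nothing fancy, just kubectl; the Pod name is again the one from the default cluster created above):

# see which provisioner backs the default StorageClass (the bundled one is k8s.io/minikube-hostpath)
kubectl get storageclass standard -o jsonpath='{.provisioner}'

# inspect the hostPath behind the PV and check which node the Pod was scheduled onto
kubectl describe pv
kubectl get pod default-server-0 -o wide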

To add more context, in case that link goes bad someday: I created a kubevirt-hostpath-provisioner.yaml file (contents given below) and then replaced the storage provisioner in minikube:

minikube addons disable storage-provisioner
kubectl delete storageclass standard
kubectl apply -f kubevirt-hostpath-provisioner.yaml

# kubevirt-hostpath-provisioner.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubevirt.io/hostpath-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubevirt-hostpath-provisioner
subjects:
  - kind: ServiceAccount
    name: kubevirt-hostpath-provisioner-admin
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: kubevirt-hostpath-provisioner
  apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubevirt-hostpath-provisioner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]

  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]

  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubevirt-hostpath-provisioner-admin
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kubevirt-hostpath-provisioner
  labels:
    k8s-app: kubevirt-hostpath-provisioner
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kubevirt-hostpath-provisioner
  template:
    metadata:
      labels:
        k8s-app: kubevirt-hostpath-provisioner
    spec:
      serviceAccountName: kubevirt-hostpath-provisioner-admin
      containers:
        - name: kubevirt-hostpath-provisioner
          image: quay.io/kubevirt/hostpath-provisioner
          imagePullPolicy: Always
          env:
            - name: USE_NAMING_PREFIX
              value: "false" # change to true, to have the name of the pvc be part of the directory
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: PV_DIR
              value: /tmp/hostpath-provisioner
          volumeMounts:
            - name: pv-volume # root dir where your bind mounts will be on the node
              mountPath: /tmp/hostpath-provisioner/
              #nodeSelector:
              #- name: xxxxxx
      volumes:
        - name: pv-volume
          hostPath:
            path: /tmp/hostpath-provisioner/
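To verify the swap and get the cluster going again, something along these lines should do it (the delete subcommand is part of the same kubectl rabbitmq plugin installed earlier; swap the cluster name if yours isn't default):

# the standard StorageClass should now report kubevirt.io/hostpath-provisioner
kubectl get storageclass standard

# tear down the crash-looping cluster and recreate it so a fresh PVC is provisioned
# by the new StorageClass (check kubectl get pvc in between to make sure the old claim is gone)
kubectl rabbitmq delete default
kubectl rabbitmq create default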
