Rook - General information
Todo
Needs updates (differentiate between running on OpenStack and on bare metal).
We configure the cluster to use Cinder CSI volumes as backing storage. The volumes are already replicated at the Ceph level of our OpenStack, so we set the pool size to 1 (= one replica only) in the Rook cluster.
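To double-check that the Rook pools really run with a single replica, the pool definitions and the live pool settings can be inspected (the second command assumes the toolbox pod described further down):

# Pool definitions managed by Rook; spec.replicated.size should show 1 here
$ kubectl -n rook-ceph get cephblockpools.ceph.rook.io -o yaml
# From inside the toolbox pod: list pools with their replication size
$ ceph osd pool ls detail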
Cinder volumes are numbered in Rook (e.g. cinder-2-ceph-data). The numbering does not always correspond to the number of the OSD that uses that volume! This is important when debugging OSD issues, and especially when removing OSDs and their backing storage.

Adding volumes can cause the worker instance to crash with a kernel panic, due to a known kernel bug (a race condition when detaching the Cinder volume from the instance).
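To find out which PVC (and therefore which Cinder volume) a given OSD is actually using, the OSD deployments can be checked against the data PVCs. The ceph.rook.io/pvc label mentioned below is how PVC-backed OSDs are tagged in the Rook versions we have seen; treat the exact label name as an assumption and verify it against the labels you actually get.

# List the data PVCs (the cinder-<n>-ceph-data claims)
$ kubectl -n rook-ceph get pvc | grep ceph-data
# Show the OSD deployments with their labels; PVC-backed OSDs carry a
# ceph.rook.io/pvc label pointing at the claim they actually use
$ kubectl -n rook-ceph get deploy -l app=rook-ceph-osd --show-labels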
To access ceph tools, run:
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash
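Once inside the toolbox pod, the regular ceph CLI is available; typical first checks:

# Overall cluster health, mon quorum and OSD summary
$ ceph status
# OSD layout and up/down state per host
$ ceph osd tree
# Raw and per-pool capacity usage
$ ceph df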
Rook stores information about the provisioning state of volumes in ConfigMap objects while provisioning has not yet completed. When removing a provisioning job halfway through, it is important to also clean up the corresponding ConfigMap object; otherwise the operator will hang.
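A sketch for cleaning up such a leftover ConfigMap; the exact names depend on the Rook version, so list and inspect the objects first rather than guessing a name (<configmap-name> below is a placeholder).

# List ConfigMaps in the Rook namespace and spot provisioning leftovers
$ kubectl -n rook-ceph get configmap
# Inspect a candidate before touching it
$ kubectl -n rook-ceph describe configmap <configmap-name>
# Remove it once the corresponding provisioning job is gone
$ kubectl -n rook-ceph delete configmap <configmap-name>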
The ceph mons use the local-storage StorageClass, which is a fancy version of hostPath and has the advantage of “binding” a pod to a node. local-storage works through a controller that presents disks (or bind-mounts of directories, as in our case) as PVs to K8s. The controller also attaches PVCs to these PVs.

Note that it is usually safe to delete a ceph mon and its PVC if at least one healthy mon remains. To be on the safe side, make sure the quorum stays > floor(mons / 2).
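A quick pre-flight check before deleting a mon; the PVC listing assumes the usual rook-ceph-mon-<letter> naming, which may differ in other deployments.

# From inside the toolbox: which mons exist and which are in quorum
$ ceph quorum_status --format json-pretty
$ ceph mon stat
# The mon PVCs on the local-storage StorageClass
$ kubectl -n rook-ceph get pvc | grep rook-ceph-mon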