Upgrading Rook and Ceph
The following sections describe how an existing rook-based Ceph cluster can be upgraded.
Ansible role rook_v2
(Helm-based installation)
The rook_v2 role should support any Helm chart version. We have tested it both on bare metal and on OpenStack up to rook v1.13.
Warning
If you’re running on bare metal, you must set a custom Ceph version prior to the upgrade to rook v1.8, as the version used by default contains the following bug, which will fry your cluster: Ceph Bug #55970. A version known to work with rook v1.8 is Ceph v16.2.13.
A word of warning / Things to be considered
Warning
Upgrading a Rook cluster is not without risk. There may be unexpected issues or obstacles that damage the integrity and health of your storage cluster, including data loss. Only proceed with this guide if you are comfortable with that.
The Rook cluster’s storage may be unavailable for short periods during the upgrade process for both Rook operator updates and for Ceph version updates.
Rook upgrades can only be performed from one official minor release to the next. This means you can only upgrade e.g. v1.2.* --> v1.3.*, v1.3.* --> v1.4.*, etc.
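Because of this constraint, a multi-release upgrade has to be broken into individual minor-release hops. As a hypothetical illustration (the version strings below are invented examples, not values read from any cluster), the required hops can be derived like this:

```shell
# Hypothetical example: derive the minor-release hops needed to get
# from the currently deployed rook version to the target version.
current="v1.2.3"   # example value, not read from a cluster
target="v1.4.9"    # example value
cur_minor=$(echo "$current" | cut -d. -f2)
tgt_minor=$(echo "$target" | cut -d. -f2)
# One hop per intermediate minor release; each must be applied in order.
for minor in $(seq $((cur_minor + 1)) "$tgt_minor"); do
    echo "upgrade to v1.${minor}.*"
done
```

Each emitted hop corresponds to one full pass through the upgrade steps below.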
Downgrades are theoretically possible, but we do not (want to) cover automated downgrades.
How to update an existing Cluster
The rook version to be deployed can be defined in your managed-k8s cluster configuration via the variable version in the [k8s-service-layer.rook] section. If not explicitly defined, the latest version which has currently been tested is used.
Steps to perform an upgrade
1. Make sure you have read this document and checked the Considerations section in the Rook Upgrade Docs. (Please select your target version on the documentation page.)
2. Determine which rook version is currently deployed. It should be the rook version currently configured in your managed YAOOK/K8s cluster configuration file. To be sure, you can check the actually deployed version with the following commands:
$ # Determine the actual rook-ceph-operator Pod name
$ POD_NAME=$(kubectl -n rook-ceph get pod \
    -o custom-columns=name:.metadata.name --no-headers \
    | grep rook-ceph-operator)
$ # Get the configured rook version
$ kubectl -n rook-ceph get pod ${POD_NAME} \
    -o jsonpath='{.spec.containers[0].image}'
3. (Optional, but informative) Determine which ceph version is currently deployed:
$ kubectl -n rook-ceph get CephCluster rook-ceph \
    -o jsonpath='{.spec.cephVersion.image}'
4. Depending on the currently deployed rook version, determine the next (supported) minor release. The managed YAOOK/K8s cluster configuration template states all supported versions. If in doubt, all supported rook releases are also stated in the k8s-service-layer/rook_v1 role and at the top of this document.
5. Set version in the rook configuration section to the next (supported) minor release of rook.

[...]
[k8s-service-layer.rook]
[...]
# Currently we support the following rook versions:
# v1.2.3, v1.3.11, v1.4.9, v1.5.12, v1.6.7, v1.7.11
version = "v1.6.7"
[...]
6. Apply the k8s-supplements or at least the rook_v2 role.

Note

As the upgrade is disruptive (at least for a short amount of time), disruption needs to be enabled.
$ # Trigger k8s-supplements
$ MANAGED_K8S_RELEASE_THE_KRAKEN=true bash managed-k8s/actions/apply-k8s-supplements.sh
$ # Trigger only rook
$ AFLAGS='--diff --tags rook' MANAGED_K8S_RELEASE_THE_KRAKEN=true bash managed-k8s/actions/apply-k8s-supplements.sh
7. Get yourself your favorite (non-alcoholic) drink and watch with fascinated enthusiasm how your rook-based ceph cluster gets upgraded. This can take several minutes, up to hours.
8. After the upgrade has finished, check that your managed-k8s cluster is still in a sane state via the smoke tests:
$ bash managed-k8s/actions/test.sh
9. Continue with steps {1,3..10} until you have reached your final target rook version.
10. Celebrate that everything worked out ᕕ( ᐛ )ᕗ
Updating rook manually
Currently, there is only one major release of rook.
Updating rook to a new patch version is fairly easy and fully automated by rook itself. You can simply patch the image version of the rook-ceph-operator.
$ # Example for the update of rook
$ # to a new (fictional) patch version of v1.7.*
$ kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=rook/ceph:v1.7.42
Updating rook to a new minor release usually requires additional steps. These steps are described in the corresponding upgrade section of the rook Docs.
Updating ceph manually
Updating ceph is fully automated by rook. As long as the currently deployed rook-ceph-operator supports the configured ceph version, the operator will perform the update without the need for further intervention. Just ensure that the ceph version really is supported by the currently deployed rook version.
$ # Example for the update of ceph to
$ # a new (fictional) release v17.2.42
$ kubectl -n rook-ceph patch CephCluster rook-ceph --type=merge \
    -p '{"spec": {"cephVersion": {"image": "ceph/ceph:v17.2.42"}}}'
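Before patching, a quick sanity check that the new image tag is actually newer than the deployed one can save you from an accidental downgrade. A small sketch using `sort -V`; the image tags below are placeholders, not values read from your cluster:

```shell
# Sketch: compare a deployed ceph image tag with the intended target tag.
current_image="ceph/ceph:v16.2.13"   # placeholder; obtain via the jsonpath query above
target_image="ceph/ceph:v17.2.42"    # placeholder target
current_tag=${current_image##*:}
target_tag=${target_image##*:}
# sort -V orders version strings numerically; the older tag sorts first
oldest=$(printf '%s\n%s\n' "$current_tag" "$target_tag" | sort -V | head -n1)
if [ "$oldest" = "$current_tag" ] && [ "$current_tag" != "$target_tag" ]; then
    echo "ok: upgrading from $current_tag to $target_tag"
else
    echo "refusing: $target_tag is not newer than $current_tag"
fi
```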
Adding/Implementing support for a new rook/ceph release to managed-k8s
Adding support for a new rook or ceph release can be accomplished with the following steps.
Adding support for a new rook release
Check for new releases in the rook GitHub repository.
Read the corresponding upgrade page in the rook Docs. Especially check the Considerations section there.
Most upgrade steps will be taken care of by Helm.
In case any changes need to be made to the values of one of the charts, place them inside an if block, e.g.:
{% if rook_version[1:] is version('1.9', '>=') %}
createPrometheusRules: true
{% endif %}
If necessary, implement any additional steps described in the rook Docs
Please also include the cluster health verification task prior and subsequent to the actual upgrade steps. As the ceph status output can differ slightly from release to release, you may need to adjust the cluster health verification tasks. You have to ensure backwards compatibility when adjusting these tasks.
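One way to keep such a verification task backwards compatible is to parse only the `HEALTH_*` token of the status output, which has stayed stable across ceph releases. A hypothetical sketch; the status line is hard-coded here, whereas a real task would read it from `ceph status`:

```shell
# Sketch: extract the overall health token from a `ceph status` line.
status_line="  health: HEALTH_OK"   # placeholder for real `ceph status` output
health=$(echo "$status_line" | grep -o 'HEALTH_[A-Z]*')
case "$health" in
    HEALTH_OK)   echo "cluster healthy, proceeding" ;;
    HEALTH_WARN) echo "cluster degraded but operational, review warnings" ;;
    *)           echo "aborting: unexpected health state '$health'" ;;
esac
```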
Make sure your implemented upgrade tasks are included at the right place and under the correct circumstances in
version_checks.yaml
Test your changes
Configure the new rook version in your managed YAOOK/K8s cluster configuration
Make sure the correct upgrade tasks are included
The rook-ceph-operator logs are very helpful to observe the upgrade.
Execute the smoke tests.
Adding support for a new ceph release
If you notice that a new ceph release is available, I do not recommend modifying/updating the mapped ceph version of an already existing rook release in k8s-config. This would trigger existing clusters to perform a ceph upgrade once the change is merged.
Rook is getting patch releases on a relatively frequent basis. If a new
patch version of rook is released, you can add it to the supported
releases map in k8s-config
along with the new ceph version you want
to have support for. Patch version upgrades of rook do not require
additional steps. In other words: Once a ceph release is bound to a rook
release, do not change that. This way we ensure that existing clusters
will not be accidentally upgraded (to a new ceph release).
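To illustrate the "once bound, never changed" rule, here is a hypothetical excerpt of such a supported-releases map. The keys, values and structure are invented for illustration; the actual layout in k8s-config may differ:

```yaml
# Hypothetical mapping of rook releases to their pinned ceph versions.
# Existing entries must never be changed; a new ceph version is only
# introduced together with a new rook patch release.
rook_ceph_versions:
  "v1.7.11": "v16.2.6"    # existing entry: stays as-is forever
  "v1.7.12": "v16.2.13"   # new rook patch release carries the new ceph version
```

Clusters pinned to an existing rook release thus keep their ceph version until an operator deliberately bumps the rook version in the cluster configuration.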