Prometheus Stack
YAOOK/K8s uses the kube-prometheus-stack helm chart with an additional abstraction layer. To find out which version is deployed, run:
$ helm ls -n monitoring
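The CHART column of the helm ls output combines chart name and version (for example kube-prometheus-stack-59.1.0). If only the version is needed in a script, the name prefix can be stripped. This is a minimal sketch; chart_version is a hypothetical helper and not part of YAOOK/K8s or helm.

```shell
# Strip the chart name prefix from a CHART column value as printed by `helm ls`.
# chart_version is a hypothetical helper introduced for illustration only.
chart_version() {
    printf '%s\n' "${1#kube-prometheus-stack-}"
}

chart_version "kube-prometheus-stack-59.1.0"   # prints 59.1.0
```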
Take a look at the values.yaml files of the individual helm charts to see what you can (or can’t) potentially configure. Note that not all values might be exposed in the config. The data path is
config -> inventory/prometheus.yaml -> monitoring_v2 -> templates/prometheus_stack.yaml.j2
If a field that you need isn’t listed in prometheus_stack.yaml or is statically configured, please open an issue or, even better, submit a merge request :). See also the YAOOK/K8s developer guide.
YAOOK/K8s also supports upgrading the kube-prometheus-stack. To do so, adjust prometheus_stack_version in the config:
monitoring = {
  prometheus_stack_version = "59.1.0";
};
If the variable isn’t set, the default is used. It is listed as monitoring_prometheus_stack_version in the following file:
$ cat managed-k8s/k8s-service-layer/roles/monitoring_v2/defaults/main.yaml
This file also lists currently supported versions. As each upgrade requires further steps, i.e. updating the CRDs, you cannot simply jump ahead.
The upgrade routine can be triggered by running the following:
$ MANAGED_K8S_RELEASE_THE_KRAKEN=true AFLAGS="--diff -t monitoring" bash managed-k8s/actions/
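Since versions cannot be skipped, it can help to list the intermediate steps before triggering the routine. The following is a minimal sketch, assuming the supported versions are passed in ascending order and each of them has to be rolled out in turn. upgrade_path is a hypothetical helper, and the version list in the usage line is invented for illustration; take the real one from defaults/main.yaml.

```shell
# Print the versions to step through when upgrading from $1 to $2.
# Remaining arguments: supported versions in ascending order.
# upgrade_path is a hypothetical helper, not part of YAOOK/K8s.
upgrade_path() {
    current=$1
    target=$2
    shift 2
    seen_current=no
    steps=""
    for v in "$@"; do
        if [ "$seen_current" = yes ]; then
            # Collect every version after the current one ...
            steps="$steps$v
"
            # ... and stop once the target has been reached.
            if [ "$v" = "$target" ]; then
                printf '%s' "$steps"
                return 0
            fi
        elif [ "$v" = "$current" ]; then
            seen_current=yes
        fi
    done
    echo "no upgrade path from $current to $target" >&2
    return 1
}

# Invented version list; prints 58.0.0 and 59.1.0 on separate lines.
upgrade_path 57.0.0 59.1.0 57.0.0 58.0.0 59.1.0
```

Each printed version would then be set as prometheus_stack_version and rolled out before moving on to the next one.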
By default, we deploy exactly one Prometheus server if monitoring is enabled. This instance doesn’t have persistent storage unless prometheus_persistent_storage_class is set in config.toml. Prometheus scrapes all PodMonitors and ServiceMonitors it can find across the different namespaces. This behavior can be altered in config.toml so that Prometheus only scrapes resources which match a certain label set.
Prometheus can be integrated into existing Prometheus-based monitoring setups via federation, which is a pull-based approach for gathering a subset of its metrics. The alternative is a push-based approach called remote write. Remote write allows your Prometheus to actively send metrics to an endpoint (remote write receiver or target) and is configured via [[k8s-service-layer.prometheus.remote_writes]] in config.toml.
The LCM uses the Grafana helm chart in the version that comes with the current kube-prometheus-stack helm chart version. Grafana is not enabled by default; you can enable it in the Prometheus configuration.
Custom dashboards and datasources
By default, Grafana will be rolled out with two sidecars (one of them grafana-sc-datasources) that are configured to pick up additional dashboards/datasources in any namespace. These extra resources can reside either in K8s Secrets or in ConfigMaps. They have to have a label grafana_dashboard.
To configure a custom, logical folder for one or more dashboards, add the annotation customer-dashboards=<Folder name>.
After 30 seconds of research, the author came to the conclusion that one cannot nest logical dashboard folders in Grafana. If <Folder name> consists of a path of multiple folders, only the last one is picked.
As an example, let’s add a dashboard for the NGINX Ingress Controller and have it displayed under the logical nginx folder in Grafana. The backing configmap for the dashboard should be stored in the namespace fancy.
Download the dashboard as JSON to your workstation. We will call that manifest nginx_db.json.
Create the configmap:
kubectl create configmap nginx-db -n fancy --from-file=nginx_db.json
Add a label so that the sidecar will pick up the dashboard:
kubectl label cm -n fancy nginx-db grafana_dashboard=1
(The value of the key/value label pair does not matter.)
Annotate the configmap with the proper path:
kubectl annotate cm -n fancy nginx-db customer-dashboards=nginx
The sidecar should pick up the change eventually. If it doesn’t or you’re impatient, you could restart Grafana by destroying its pod.
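The three kubectl steps above can also be expressed as a single manifest, which is easier to keep in version control. The following is a minimal sketch that prints such a ConfigMap for a given dashboard file. dashboard_configmap is a hypothetical helper, and the label value "1" is arbitrary, as noted above.

```shell
# Print a ConfigMap manifest that carries the dashboard JSON together with the
# label and the folder annotation the sidecar looks for.
# dashboard_configmap is a hypothetical helper, not part of YAOOK/K8s.
dashboard_configmap() {
    name=$1 namespace=$2 folder=$3 file=$4
    cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: $name
  namespace: $namespace
  labels:
    grafana_dashboard: "1"
  annotations:
    customer-dashboards: $folder
data:
  $(basename "$file"): |
$(sed 's/^/    /' "$file")
EOF
}

# Usage (requires access to the cluster):
# dashboard_configmap nginx-db fancy nginx nginx_db.json | kubectl apply -f -
```

The sed call indents the JSON by four spaces so that it forms a valid YAML block scalar under the file name key.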
The monitoring stack comes with an Alertmanager instance that is available to the end user for their convenience. One can also create their own Alertmanager resource, which is then picked up and deployed by the prometheus operator. AM configuration should be kept separate and can be injected by creating an AlertmanagerConfig resource within the monitoring namespace. Other namespaces are not considered without further configuration. For further information, please refer to the corresponding documentation.
A silly example:
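apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: custom-amc
  namespace: monitoring
spec:
  receivers:
    - name: your mom
      emailConfigs:
        - hello: localhost
          requireTLS: true
      # assumption: the secret reference below belongs to a PagerDuty receiver
      pagerdutyConfigs:
        - url:
          routingKey:
            key: a
            name: blub
            optional: true
  route:
    receiver: your mom
    groupBy:
      - job
    continue: false
    routes:
      - receiver: your mom
        # assumption: the parent key of the alertname entry is "match"
        match:
          alertname: Watchdog
        continue: false
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h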
The author hasn’t worked much with Alertmanager(Config) in the past and only ensured that manifests are read correctly. Their test was looking at the output of:
$ kubectl exec -ti -n monitoring alertmanager-prometheus-stack-kube-prom-alertmanager-0 -- amtool --alertmanager.url= config
You will probably mess up the AlertmanagerConfig manifest in one way or another. The AdmissionController caught some typos. On other occasions, I had to look into the logs of the prometheus-operator pod. And eventually the AM failed to come up because I missed some further fields, which I figured out via the logs of the AM.
Thanos is deployed outside of the kube-prometheus-stack helm chart. By default, it writes its metrics into a SWIFT object storage container that resides in the same OpenStack project.
We’re deploying the Bitnami Thanos helm chart with adjusted values by default. Please refer to its documentation for further details.
Thanos can be enabled and configured in the Prometheus configuration.
Object Storage Configuration
You can either use the automated Thanos object storage management (default), in which case the LCM takes care of creating a bucket inside your OpenStack project, or configure a custom bucket.
Automated bucket management
The automated bucket management can only be used when your cluster is created on top of OpenStack and a valid OpenStack RC file is sourced.
This method is enabled by default. It lets Terraform create an object storage container inside your OpenStack project and automatically configures Thanos to use that container as primary storage.
Custom bucket management
The custom bucket management can be enabled by setting
k8s-service-layer.prometheus.manage_thanos_bucket = false
in your config.
You must supply a valid configuration for a supported Thanos client. This configuration must be stored in your cluster key-value secrets engine under kv/data/thanos-config.
Inserting a Thanos client config into vault can be automated by storing the configuration at config/thanos.yaml (or at another location specified in your config) and then triggering the vault update script:
$ ./managed-k8s/tools/vault/
Alternatively, you can also manually insert your configuration into vault.
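As an illustration of what such a client configuration could look like, the following sketch writes an S3-type configuration to config/thanos.yaml. It assumes the file holds the object storage configuration in the format documented by Thanos itself; the bucket name, endpoint, and credentials are placeholders and must be replaced with your own values.

```shell
# Write a Thanos object storage client configuration to config/thanos.yaml.
# Assumption: the file uses the objstore format documented by Thanos.
# Bucket, endpoint and credentials are placeholders, not real values.
mkdir -p config
cat > config/thanos.yaml <<'EOF'
type: S3
config:
  bucket: "my-thanos-bucket"
  endpoint: "s3.example.com"
  access_key: "REPLACE_ME"
  secret_key: "REPLACE_ME"
EOF
```

Since the file contains credentials, it should be handled like any other secret and not be committed in plain text.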
Prometheus Adapter (metrics server)
Background and motivation
The prometheus-adapter provides the metrics API by making use of existing Prometheus metrics. For the default resources (memory and CPU per pod/node), Prometheus fetches these metrics from kubelet which, in turn, reads these values from cAdvisor, which gets its values from cgroups on the individual node. metrics-server gets those metrics directly from kubelet/cAdvisor.
A common use case for the metrics API is horizontal (HPA) and vertical pod autoscaling (VPA). An advantage of prometheus-adapter compared to metrics-server is that one can define custom metrics for HPA and VPA. kubectl top nodes and kubectl top pods also need a working metrics API :)
As stated above, the values of the metrics API are derived from stats of the cgroups on the node. kubelet creates a resource tree with the layers QoS (Guaranteed, Burstable, BestEffort), pod, and container.
A sample tree:
root@managed-k8s-worker-1:/sys/fs/cgroup/unified/kubepods.slice# tree -d
.
├── kubepods-besteffort.slice
│   ├── kubepods-besteffort-pod1793a176_009e_4b22_9d89_6d71f914f6f7.slice
│   │   ├── docker-2dbb7f0327a157479fda466398aa87664069610232b293f5817b2712b9ff5719.scope
│   │   └── docker-51fdb8e253c7873a04db7219fb602694ad3977957a8ee354d362ce25cd29d3c8.scope
│   ├── kubepods-besteffort-pod2c9a23a5_effa_4130_aa19_5efac4829224.slice
│   │   ├── docker-817cb87c8d31136e3ef7d6274393127184b4781367bf3b9b62e572b796ebecd4.scope
│   │   └── docker-bb2c7f5087e52182667e63fc548fbb15d7981fa7322b58b59c529bbca71a8361.scope
│   ├── kubepods-besteffort-pod6aaaaf32_9f4e_46fe_841f_13bee2413625.slice
│   │   ├── docker-3a58ec66ee269a25dc14d580fd9ea4766ff6fcb269b7be39bdc08abd9c0a87f4.scope
│   │   └── docker-3ad62f52496d25dd5ef3f8b9b462776bbd7023ed1c37c56b19429b8c7b926ad6.scope
│   ├── kubepods-besteffort-podb2481109_b708_49f3_b2bb_52b0fb470fe9.slice
│   │   ├── docker-601173595b1d0d6b08b7965e28e04c83a64900e2642d3c48ff0f972019f9f556.scope
│   │   ├── docker-9edfeb7ab8ae757ffb90e847ffa70b2281e89367eca3f34d89065225e61e47ba.scope
│   │   └── docker-de4b153c2c49bb04c0b45f534694fd143d70f25b18503626d67a4fd73c016ea5.scope
│   ├── kubepods-besteffort-podb393dd5c_0c80_488b_bed1_c548aea803a3.slice
│   │   ├── docker-7f22a8b72620cd7b6d740de9957f10eed127063b64745df8b45b432d299d04f0.scope
│   │   └── docker-e3a42aca173771b1089d97ba8664d6fd04e9f5ed736a1167c75b3f71025315e9.scope
│   ├── kubepods-besteffort-podcd213409_756c_4d17_9b7b_9a9b023d8533.slice
│   │   ├── docker-ab7a790f1afbd39ffaef0ce1bdb0dbbe7b9525ad785190e498b9a68754f96c86.scope
│   │   └── docker-eac640f0373dc37d45e6d36375656db04d2b815e605d9c8b1c8a2652e1a66e65.scope
│   └── kubepods-besteffort-podeba9d649_010c_4122_bcee_27255d8ad69c.slice
│       ├── docker-087baf1b34e7a703d81cbe8a988d2eb9e0837f86b798066789436443cfea090e.scope
│       └── docker-68c2d4b2f374611a1e550b7f3b31dba3039d5c98b5d931fb87638cf0114bd9a8.scope
└── kubepods-burstable.slice
    ├── kubepods-burstable-pod4bbb178b_3396_49b7_90e7_6264b7392aa2.slice
    │   ├── docker-5f4521bde3825fa1b35262ed377c95ce47cdd322e2f017a9a8f1083e05a8d39b.scope
    │   ├── docker-6b6d47a682fc95ca0d7c37cf83f391c3d0f8bacda88eae22634b4c5dff043dbf.scope
    │   └── docker-cd817ca433d294ae3701c61dab312ab5715525cf3cd8c74fc5f1471bbcde59c3.scope
    ├── kubepods-burstable-pod793e426b_16c6_4b86_a0b8_e4b4ed877c15.slice
    │   ├── docker-7158fab7cdc1af3bc68599e8fa0cfcc637840a8a9fea65a94cc467e7836310ea.scope
    │   └── docker-92a0b9788b01f2ca82792d93bbdfb90da419097c61493dcd6587fafacace1d91.scope
    └── kubepods-burstable-pod81795b29_e574_4d5e_866c_ad146e86bdbb.slice
        ├── docker-51c9c0b1dcf6153572661b8bcb9d99ea4a4934db35e074fb88297b4b36002ace.scope
        └── docker-dee4dea98d5e8e6282fe64607d3c91e3ee071d2fab2570d44eedac649702daf2.scope
34 directories
Note: /sys/fs/cgroup/unified/ is the mount point of cgroups v2 on an Ubuntu 20.04 node. As it seems, cgroups v1 is still the default, so information such as memory usage has to be fetched from the corresponding memory controllers.
Those values are translated into metrics such as:
container_memory_working_set_bytes{container="POD", endpoint="https-metrics", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod00376bc9_6679_4f56_a9dd_a10aad6ff2d4.slice/docker-5b9efdb04ff83031b437fde548968ef9b92c3febccb03946ec421b11d12893dd.scope", image="", instance="", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_POD_prometheus-stack-prometheus-node-exporter-z8qj7_monitoring_00376bc9-6679-4f56-a9dd-a10aad6ff2d4_0", namespace="monitoring", node="managed-k8s-worker-0", pod="prometheus-stack-prometheus-node-exporter-z8qj7", service="prometheus-stack-kube-prom-kubelet"}
container_memory_working_set_bytes{container="POD", endpoint="https-metrics", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod01ed3a39_a5f0_4465_a33f_63645893aa1e.slice/docker-469c599d81d233dd2a1d6e1ea252ca1535df26e4c57f04451c066bf1589cc129.scope", image="", instance="", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_POD_nvidia-device-plugin-daemonset-4gbd2_kube-system_01ed3a39-a5f0-4465-a33f-63645893aa1e_0", namespace="kube-system", node="managed-k8s-worker-0", pod="nvidia-device-plugin-daemonset-4gbd2", service="prometheus-stack-kube-prom-kubelet"}
container_memory_working_set_bytes{container="POD", endpoint="https-metrics", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod12321fde_373b_4347_ad3e_f31b4f587d35.slice/docker-2bd3158e1dc1d1911dcb294e62463c6da24517287c77eb132cf22bafe1710bc4.scope", image="", instance="", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_POD_thanos-sample-storegateway-0_monitoring_12321fde-373b-4347-ad3e-f31b4f587d35_0", namespace="monitoring", node="managed-k8s-worker-2", pod="thanos-sample-storegateway-0", service="prometheus-stack-kube-prom-kubelet"}
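The id label of these metrics mirrors the cgroup tree shown earlier, so the QoS class, pod UID, and container ID can be read back from it. The following is a minimal sketch, assuming the systemd-style slice names and the docker- scope prefix seen in the samples above. parse_cgroup_id is a hypothetical helper, and treating a path without a QoS slice as Guaranteed is an assumption.

```shell
# Split a cAdvisor "id" label (systemd cgroup path) into QoS class, pod UID
# and container ID. parse_cgroup_id is a hypothetical helper for illustration.
parse_cgroup_id() {
    id=$1
    case $id in
        */kubepods-besteffort.slice/*) qos=BestEffort ;;
        */kubepods-burstable.slice/*)  qos=Burstable ;;
        *)                             qos=Guaranteed ;;  # assumption: no QoS slice
    esac
    # Pod slice: kubepods-<qos>-pod<uid with underscores>.slice
    pod=${id%.slice/*}      # drop the trailing container scope
    pod=${pod##*-pod}       # drop everything up to and including "-pod"
    pod_uid=$(printf '%s' "$pod" | tr '_' '-')
    # Container scope: docker-<container id>.scope
    container=${id##*/docker-}
    container=${container%.scope}
    printf 'qos=%s pod_uid=%s container=%s\n' "$qos" "$pod_uid" "$container"
}

parse_cgroup_id "/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod00376bc9_6679_4f56_a9dd_a10aad6ff2d4.slice/docker-5b9efdb04ff83031b437fde548968ef9b92c3febccb03946ec421b11d12893dd.scope"
```

The pod UID obtained this way matches the UID embedded in the name label of the same sample, which is a quick way to cross-check the parsing.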