yk8s - Introduction
This project uses Ansible to provide a customizable, highly available, scalable and flexible kubeadm-based k8s cluster installation and lifecycle management on top of OpenStack or bare metal.
Hint
If you want to get your cluster up and running, the Quick Start Guide is a good place to begin.
Main Feature Selling Points
- can be deployed on OpenStack and on bare metal
- on OpenStack, self-developed Load-Balancing-as-a-Service solution (no Octavia)
- Nvidia GPU and vGPU Support
- Prometheus-based Monitoring Stack
- Rook-based Ceph Storage
- NGINX Ingress Controller
- Cert-Manager
- Network Policies Support
Architecture Overview
There are four kinds of host nodes:
Type | Short Description
---|---
Frontend Node | The frontend nodes act as the entry point to the Kubernetes cluster. They are highly available, provide load balancing and act as a firewall.
Control Plane Node | The control plane nodes form the k8s control plane and manage the (meta-)workers and the Pods in the cluster. More details can be found in the official k8s docs.
Meta-Worker | The meta-workers host the management application workload, e.g. of the Rook storage solution or the Prometheus-based monitoring stack (more details soon).
Worker | The workers host the user application workload.
Note
A control plane node can also act as a frontend node.
Additional Details
Frontend Nodes
Frontend nodes are the only entry points into the private network because
they are the only ones holding floating IPs. Frontend nodes are made
redundant via keepalived. They also act as SSH jump hosts.
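For example, a node in the private network could be reached through a frontend node using OpenSSH's ProxyJump option. This is only a sketch; the host aliases, user and addresses are placeholders, not names used by this project:

```
# ~/.ssh/config -- all aliases, users and addresses are placeholders
Host yk8s-frontend
    HostName 203.0.113.10       # floating IP of a frontend node
    User deploy

Host yk8s-worker-*
    ProxyJump yk8s-frontend     # hop through the frontend jump host
    User deploy
```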
Each frontend node hosts an instance of
HAProxy.
HAProxy acts as a load-balancing endpoint for the k8s API server.
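As an illustration, a minimal HAProxy configuration for this role might look like the sketch below. The frontend name, port and backend addresses are assumptions for the sketch, not the configuration shipped by this project:

```
# Sketch of an HAProxy frontend/backend for the k8s API server.
# All names and addresses are illustrative placeholders.
frontend k8s-api
    bind *:6443
    mode tcp
    default_backend k8s-control-plane

backend k8s-control-plane
    mode tcp
    balance roundrobin
    # one entry per control plane node
    server cp-0 192.0.2.10:6443 check
    server cp-1 192.0.2.11:6443 check
    server cp-2 192.0.2.12:6443 check
```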
An extra network port is used to hold both the private and the public
virtual IP (VIP). As a health check, a script queries the /healthz
endpoint of HAProxy.
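A minimal sketch of how such a keepalived setup could look follows, assuming a hypothetical helper script that performs the /healthz query; the interface name, priorities, script path and addresses are placeholders:

```
# Sketch of a keepalived VRRP instance tracking HAProxy's health.
# Interface, VIPs and script path are illustrative placeholders.
vrrp_script check_haproxy {
    script "/usr/local/bin/check_haproxy.sh"  # queries HAProxy's /healthz
    interval 2
    fall 2
    rise 2
}

vrrp_instance VI_public {
    state BACKUP
    interface eth1              # the extra network port holding the VIPs
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        203.0.113.10            # public VIP (example)
        192.0.2.1               # private VIP (example)
    }
    track_script {
        check_haproxy
    }
}
```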
Both services run in Docker containers for isolation. However, they
might be confined by systemd in the future instead.
Control Plane Nodes
The number of control plane nodes should be odd (1, 3, 5, …) because the cluster state is stored in etcd, which uses the Raft consensus protocol. To prevent a split-brain situation, a majority (quorum) of the nodes has to be up. With 3 control plane nodes, one can fail without a problem; if two are down, the remaining one stops accepting writes because it cannot tell a node failure apart from a network partition. Five nodes can tolerate two failed nodes.
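More generally, a Raft cluster of n members needs a quorum of floor(n/2) + 1 members to stay available and can therefore tolerate floor((n-1)/2) member failures; this is also why an even number of nodes adds no fault tolerance:

Cluster size | Quorum | Tolerated failures
---|---|---
1 | 1 | 0
2 | 2 | 0
3 | 2 | 1
4 | 3 | 1
5 | 3 | 2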