Setup

Kubernetes lab cluster demo, provisioned by Terraform, running on Incus (LXD) servers. Kube node images baked by Ansible. Flux brings the workload online via GitOps.

Everything from control plane bootstrap through workload instantiation is defined in the setup repository, so the cluster can be repeatedly re-initialized from a checkout. This facilitates development of the platform itself.

Status

Up to 64 worker nodes have been tested, but a handful is more typical. The point is to run a control plane and multiple workers as a full Kubernetes stack on real Terraformed infrastructure, yet hosted by one or a small number of machines, instead of relying on more limited test environments like Minikube or Kind.

Notes:

  • only used for author's demonstration purposes
  • only one master control node implemented (for each plex)
  • some functionality unpushed or encrypted (no external users)
  • must pre-bake all os-base/virt-type/host-type combinations
  • only one bake in progress at a time per plex

Primary development plex is called omniplex and the staging one is called verniplex. In the future these may be joined together with Cilium's ClusterMesh feature.

Overview

Ansible is used to bake and upload Kubernetes node base images in lxdbake.yml (via bin/lxdbake), using transient Terraformed instances (created when the baketime variable is set).
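
The bake play itself isn't reproduced here; as a rough, hypothetical sketch of the flow (group, role, and variable names below are invented, not the repo's actual ones), baking amounts to configuring a transient instance with the node type's roles and publishing the result as an image:

  # illustrative sketch only, not the repo's lxdbake.yml; group, role,
  # and variable names are hypothetical
  - hosts: bake_instances          # transient Terraformed bake instances
    roles:
      - kubenode                   # roles selected for the node type

  - hosts: incus_servers
    tasks:
      - name: publish the baked instance as a reusable image
        ansible.builtin.command:
          cmd: incus publish {{ bake_instance }} --alias {{ image_alias }}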

Bake and deploy roles and tasks for each node type are specified in types/ and roles/. Cluster configuration is described in terraform.tfvars and ansible vars/ files. The plexhocs variable is a nested map that defines the global shape of all clusters (i.e. each "plex").
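
The plexhocs schema itself isn't shown in this README; purely to illustrate the idea of a nested per-plex map (keys and values below are invented, and the real variable lives in terraform.tfvars), it might be shaped something like:

  # invented illustration of a nested per-plex map, not the actual schema
  plexhocs:
    omniplex:
      testplex:                # node-name prefix; testplex1 becomes the master
        nodes: 4
        osbase: debian12       # os-base/virt-type/host-type must be pre-baked
        virttype: container
        hosttype: kube
    verniplex:
      stageplex:
        nodes: 2
        osbase: debian12
        virttype: container
        hosttype: kube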

Terraform provisions the defined Incus node instances for each plex, in the right order. Upon node deployment, kubestrap.yml brings up the Kubernetes control plane, or joins a worker to the candidate pool. Available worker capacity then triggers the Flux install via fluxstrap.yml, which implements the data plane. Kubernetes objects are typically defined as HelmRelease values in apps/, and a Kustomize build specified in clusters/ includes these for each plex.
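
As a sketch of that layout (the file path and object name are illustrative, while the Flux Kustomization kind and fields are the standard ones), a per-plex entry under clusters/ pulls the HelmRelease manifests in from apps/:

  # clusters/omniplex/apps.yaml -- illustrative path, not necessarily the repo's
  apiVersion: kustomize.toolkit.fluxcd.io/v1
  kind: Kustomization
  metadata:
    name: apps
    namespace: flux-system
  spec:
    interval: 10m
    prune: true
    sourceRef:
      kind: GitRepository
      name: flux-system
    path: ./apps               # includes the HelmRelease values defined there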

The first node in a plex's node range (e.g. testplex1) must always be defined in plexhocs if any kube nodes are defined, and will be provisioned as the control-plane master. Multiple/arbitrary masters are planned, but don't work yet.

Clusters are recreated by dependency cascade from the master, i.e. tfapply -replace=<masternode> re-creates the whole cluster.

Infrastructure

The Kubernetes nodes are privileged Incus system containers: fully functional nodes running stock kube-apiserver or kubelet processes. Despite sharing the host kernel rather than running their own, everything works, including BPF networking with Cilium. Nodes are incus_instance resources created with some LXC config magic.
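
The repo's exact instance config isn't reproduced here, but the usual Incus settings for running Kubernetes inside a privileged system container look roughly like the following (the specific keys this repo passes may differ):

  # typical, not repo-verified, Incus instance/profile config for a kube node
  config:
    security.privileged: "true"      # privileged container, as noted above
    security.nesting: "true"         # allow nested container runtimes
    linux.kernel_modules: br_netfilter,ip_tables,ip6_tables,nf_nat,overlay
    raw.lxc: |
      lxc.apparmor.profile=unconfined
      lxc.cgroup.devices.allow=a
      lxc.mount.auto=proc:rw sys:rw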

Upstream OCI images are read-through cached by a standalone Zot OCI registry, installed via Docker on its own incus_instance. This registry can also be used to store CI pipeline build outputs.
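
As a rough illustration of that piece (the image tag, port, and paths are assumptions, not taken from the repo), running Zot under Docker could look like:

  # illustrative docker-compose.yml, not the repo's actual deployment
  services:
    zot:
      image: ghcr.io/project-zot/zot-linux-amd64:latest
      ports:
        - "5000:5000"
      volumes:
        - ./config.json:/etc/zot/config.json:ro   # sync (read-through) rules
        - zot-data:/var/lib/registry
  volumes:
    zot-data: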

Cluster host data is static and precomputed, so all hostfiles, networking config and hostkeys for the "plex" are either baked into the node images or given during node instantiation by cloud-init. Adding another "plex" amounts to adding another block of precomputed hostnames and keys.
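
For illustration, the per-node cloud-init user-data carrying that precomputed identity might look something like this (hostnames, addresses, and key material are placeholders):

  #cloud-config
  # placeholder values; real hostnames, addresses and keys are precomputed per plex
  hostname: testplex2
  ssh_keys:
    ed25519_private: |
      -----BEGIN OPENSSH PRIVATE KEY-----
      ...
      -----END OPENSSH PRIVATE KEY-----
    ed25519_public: ssh-ed25519 AAAA... root@testplex2
  write_files:
    - path: /etc/hosts
      append: true
      content: |
        10.0.0.11 testplex1
        10.0.0.12 testplex2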

Bootstrap is directed by a user on an Ansible controller that has a checked-out repository with unlocked secrets.

Contents

directories:

terraform/ HCL defining cluster infrastructure on Incus nodes
roles/ ansible roles to apply onto nodes, image bake/config
types/ which roles/actions to apply onto different node types
clusters/ flux kustomizations defining each cluster workload
apps/ kubernetes application manifests, etc

plays:

lxdbake.yml builds node images by node type and os base
kubestrap.yml kubernetes control plane bootstrap
fluxstrap.yml kubernetes data plane bootstrap
kubedel.yml drain and remove node from kubernetes cluster

scripts:

bin/ansrole applies ansible roles onto nodes, with several options
bin/tfapply wraps terraform apply with more concise output/summary
bin/tfcloudinit interacts with ansible to get cloudinits at baketime
bin/tfhosts imports nsswitch hosts into terraform as a data source
bin/tfdeps lists dependencies of terraform objects
bin/lxdbake bakes kube/standalone images of a given node type for incus
bin/mkgenders converts an ansible inventory to genders format
bin/genders converts genders format to an ansible inventory
callbacks/unixz.py more concise fork of the ansible unixy callback plugin
filters/ghreltags.py ansible filter to get the latest release tag from github

Howto

TBD -- Submit an issue if interested.
