
Design Document — bosh-lab-kvm-cauldron-c1bf

Repository: bosh-lab-kvm-cauldron-c1bf

Purpose: One-command BOSH + CredHub + Concourse local lab on KVM/libvirt


Architecture Overview

┌──────────────────────────────────────────────────────┐
│  Developer Laptop (Linux, 64GB RAM, 16 threads)      │
│                                                      │
│  ┌─────────────────────────────────────────────────┐ │
│  │  KVM / libvirt                                  │ │
│  │                                                 │ │
│  │  Network: bosh-lab (10.245.0.0/24, NAT)         │ │
│  │  ┌─────────────────────────────────────────┐    │ │
│  │  │  Mgmt VM (10.245.0.2)                   │    │ │
│  │  │  Ubuntu 22.04 LTS                       │    │ │
│  │  │  ┌──────────────────────────────────┐   │    │ │
│  │  │  │  BOSH Director + CredHub + UAA   │   │    │ │
│  │  │  │  (bosh create-env, bosh-lite)    │   │    │ │
│  │  │  └──────────────────────────────────┘   │    │ │
│  │  │                                         │    │ │
│  │  │  BOSH-managed VMs (containers/VMs):     │    │ │
│  │  │  ┌────────────┐ ┌────────────────────┐  │    │ │
│  │  │  │ Concourse  │ │ User deployments   │  │    │ │
│  │  │  │ web+db+wkr │ │ (zookeeper, etc.)  │  │    │ │
│  │  │  └────────────┘ └────────────────────┘  │    │ │
│  │  └─────────────────────────────────────────┘    │ │
│  │                                                 │ │
│  └─────────────────────────────────────────────────┘ │
│                                                      │
│  Host filesystem:                                    │
│  ./state/ ─────────── 9p mount ──────── /mnt/state   │
│    ├─ vars-store.yml  (director creds)               │
│    ├─ creds/          (SSH keys, state files)        │
│    ├─ ca/             (CA certs)                     │
│    ├─ cache/          (stemcells, releases, images)  │
│    ├─ logs/           (bootstrap logs)               │
│    └─ terraform.tfstate                              │
└──────────────────────────────────────────────────────┘

Libvirt Network Design

Network: bosh-lab — a NAT network on 10.245.0.0/24.

| IP Range | Purpose |
|---|---|
| 10.245.0.1 | Gateway (libvirt host bridge) |
| 10.245.0.2 | Management VM (BOSH Director) |
| 10.245.0.3-10.245.0.9 | Reserved for future management VMs |
| 10.245.0.10-10.245.0.50 | Static IPs for BOSH deployments (Concourse, etc.) |
| 10.245.0.51-10.245.0.254 | Dynamic pool for compilation VMs |

DHCP is disabled. BOSH manages all IP assignment via the CPI. The management VM gets a static IP from Terraform. The gateway provides NAT for outbound internet access (stemcell/release downloads).

DNS: The libvirt network provides local DNS resolution. Upstream DNS falls back to 8.8.8.8.
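The address plan can be expressed as a small lookup, which is handy when checking that a static IP in a deployment manifest falls in the intended range. This is a minimal sketch; the function name `classify_ip` and its output labels are illustrative and not part of the repository.

```shell
# Classify an address on the bosh-lab network (10.245.0.0/24) according to
# the range plan in this section. classify_ip is a hypothetical helper.
classify_ip() (
  ip="$1"
  case "$ip" in
    10.245.0.*) last="${ip#10.245.0.}" ;;
    *) echo "outside-network"; exit 1 ;;
  esac
  # The host part must be a plain decimal number
  case "$last" in
    ''|*[!0-9]*) echo "invalid"; exit 1 ;;
  esac
  if   [ "$last" -eq 1 ]; then echo "gateway"
  elif [ "$last" -eq 2 ]; then echo "management-vm"
  elif [ "$last" -ge 3 ]  && [ "$last" -le 9 ];   then echo "reserved"
  elif [ "$last" -ge 10 ] && [ "$last" -le 50 ];  then echo "static"
  elif [ "$last" -ge 51 ] && [ "$last" -le 254 ]; then echo "dynamic"
  else echo "invalid"; exit 1
  fi
)
```

For example, `classify_ip 10.245.0.10` reports the Concourse web address as part of the static range.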

CPI Configuration

The lab uses the libvirt CPI (a2geek/libvirt-bosh-cpi v4.1) to allow BOSH to orchestrate VMs via KVM/libvirt.

How it works:

1. Terraform creates the libvirt network and management VM.
2. bosh create-env runs inside the mgmt VM, using bosh-deployment with a custom ops file that swaps the VirtualBox CPI for the libvirt CPI.
3. The CPI talks to qemu:///system to create/destroy/manage VMs.
4. BOSH-managed VMs (Concourse workers, user deployments) run as nested KVM guests inside the mgmt VM.

Ops file: manifests/director/ops/libvirt-cpi.yml overrides the CPI release, stemcell source, and cloud provider configuration.
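To make the director bootstrap concrete, the sketch below assembles a plausible `bosh create-env` argument list from the paths and addresses named in this document. It only prints the arguments, one per line. The variables `BOSH_DEPLOYMENT_DIR` and `STATE_DIR`, the function name, and the `director_name` value are assumptions for illustration; the libvirt CPI ops file may well require additional `-v` variables that are not shown here.

```shell
# Print the arguments for a bosh create-env run, one per line.
# BOSH_DEPLOYMENT_DIR, STATE_DIR and director_name=bosh-lab are hypothetical
# choices for this sketch, not values taken from the repository.
director_create_env_args() (
  bd="${BOSH_DEPLOYMENT_DIR:-./bosh-deployment}"
  state="${STATE_DIR:-/mnt/state}"
  # uaa.yml is listed before credhub.yml because CredHub authenticates via UAA
  printf '%s\n' \
    create-env "${bd}/bosh.yml" \
    -o manifests/director/ops/libvirt-cpi.yml \
    -o "${bd}/uaa.yml" \
    -o "${bd}/credhub.yml" \
    --state "${state}/creds/director-state.json" \
    --vars-store "${state}/vars-store.yml" \
    -v director_name=bosh-lab \
    -v internal_ip=10.245.0.2 \
    -v internal_gw=10.245.0.1 \
    -v internal_cidr=10.245.0.0/24
)
```

Since none of the values contain whitespace, the output could be fed to the CLI with `director_create_env_args | xargs bosh`. Printing first makes it easy to review the ops-file order before anything is created.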

Limitations:

- The libvirt CPI is a community project, not an official Cloud Foundry CPI. It supports manual networks only (no dynamic/vip).
- Disk resizing was added in v4 but may have edge cases.
- Nested virtualization must be enabled on the host.
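The nested virtualization requirement can be checked on the host before running Terraform. The sketch below reads the `nested` parameter that the Linux `kvm_intel` and `kvm_amd` modules expose under `/sys/module`. The function name and the `SYSFS_ROOT` override (which exists only so the check can be exercised against a fake directory) are assumptions of this example.

```shell
# Exit 0 if nested virtualization is enabled, 1 if it is disabled,
# 2 if neither KVM vendor module is loaded.
# SYSFS_ROOT is a hypothetical override used for testing; it defaults to /sys.
nested_virt_enabled() (
  root="${SYSFS_ROOT:-/sys}"
  for mod in kvm_intel kvm_amd; do
    f="${root}/module/${mod}/parameters/nested"
    if [ -r "$f" ]; then
      # The kernel reports the flag as Y/N or 1/0 depending on module and version
      case "$(cat "$f")" in
        Y|y|1) exit 0 ;;
        *)     exit 1 ;;
      esac
    fi
  done
  exit 2
)
```

A bootstrap script could call `nested_virt_enabled || echo "enable nested KVM first" >&2` and stop early rather than fail later inside the CPI.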

CredHub Enablement

CredHub is deployed as part of the BOSH Director via the credhub.yml ops file from bosh-deployment.

Access model:

- CredHub runs on the director VM at https://10.245.0.2:8844.
- Authentication is via UAA (deployed alongside, ops file uaa.yml).
- The credhub-admin client is auto-generated in vars-store.yml.
- BOSH deployments can reference CredHub variables using ((variable_name)) syntax.

Credential flow:

1. vars-store.yml stores all generated credentials (passwords, certs, keys).
2. This file lives on the host at ./state/vars-store.yml (persists across VM rebuilds).
3. The 9p mount makes it available inside the VM at /mnt/state/vars-store.yml.
4. credhub login uses the client secret from vars-store.
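The last step of the credential flow can be sketched as a helper that pulls the client secret and CA certificates out of the vars-store with `bosh int` and exports the environment variables the credhub CLI reads. The function name is illustrative, and the variable paths (`/credhub_admin_client_secret`, `/credhub_tls/ca`, `/uaa_ssl/ca`) follow the upstream bosh-deployment ops files as I understand them; treat them as assumptions and check them against the pinned commit.

```shell
# Export CREDHUB_* variables from vars-store.yml so the credhub CLI can log in.
# credhub_lab_env is a hypothetical helper; the --path values are assumed to
# match the variable names generated by bosh-deployment's uaa.yml/credhub.yml.
credhub_lab_env() {
  _vars="${1:-/mnt/state/vars-store.yml}"
  CREDHUB_SECRET="$(bosh int "$_vars" --path /credhub_admin_client_secret)" || return 1
  _credhub_ca="$(bosh int "$_vars" --path /credhub_tls/ca)" || return 1
  _uaa_ca="$(bosh int "$_vars" --path /uaa_ssl/ca)" || return 1
  # Trust both the CredHub API CA and the UAA CA
  CREDHUB_CA_CERT="$(printf '%s\n%s' "$_credhub_ca" "$_uaa_ca")"
  CREDHUB_SERVER="https://10.245.0.2:8844"
  CREDHUB_CLIENT="credhub-admin"
  export CREDHUB_SERVER CREDHUB_CLIENT CREDHUB_SECRET CREDHUB_CA_CERT
}
```

Inside the mgmt VM this would be used as `credhub_lab_env && credhub login`. Keeping the secret in the environment avoids passing it as a command-line argument, where it would show up in the process list.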

Security Exposure Defaults

Principle: local-only by default. Nothing is exposed to the LAN.

| Service | Bind Address | Port | Exposure |
|---|---|---|---|
| BOSH Director | 10.245.0.2 | 25555 | NAT network only |
| CredHub | 10.245.0.2 | 8844 | NAT network only |
| UAA | 10.245.0.2 | 8443 | NAT network only |
| Concourse Web | 10.245.0.10 | 443 | NAT network only |
| VNC (mgmt VM) | 127.0.0.1 | auto | Localhost only |

To access Concourse from the host browser, set up an SSH tunnel through the management VM:

ssh -i state/creds/mgmt_ssh -L 8443:10.245.0.10:443 bosh@10.245.0.2
Then visit https://127.0.0.1:8443.

To expose to LAN (opt-in): Modify the libvirt network mode from nat to bridge and update firewall rules. This is deliberately not automated.

Version Pinning Strategy

Every external dependency is pinned to a specific version. No latest tags.

| Component | Version | Pin Location |
|---|---|---|
| bosh-cli | 7.9.17 | bootstrap/lib/common.sh, bootstrap/remote/install-tools.sh |
| credhub-cli | 2.9.53 | bootstrap/lib/common.sh, bootstrap/remote/install-tools.sh |
| bosh-deployment | commit faf834a | bootstrap/lib/common.sh, bootstrap/remote/create-director.sh |
| libvirt CPI | v4.1 | manifests/director/ops/libvirt-cpi.yml |
| Concourse release | 8.0.1 | manifests/concourse/concourse.yml |
| Stemcell | ubuntu-jammy/1.1044 | manifests/concourse/concourse.yml, cloud-config |
| fly CLI | 8.0.1 | bootstrap/remote/install-tools.sh |
| Ubuntu cloud image | jammy (22.04 LTS) | Makefile (downloaded to cache) |
| Terraform | >= 1.13.0 | terraform/versions.tf |
| terraform-provider-libvirt | ~> 0.9.2 | terraform/versions.tf |

Update process: Change the version in every pin location listed for the component, run make reset && make up && make bootstrap, then verify that the acceptance tests pass.
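The "no latest tags" rule lends itself to a crude automated check. The sketch below greps a set of pin-location files for the word `latest` and reports any hits. The helper name is hypothetical, and a plain word match will also flag harmless occurrences such as comments, so this is a starting point rather than a complete lint.

```shell
# Scan the given files for a floating "latest" tag.
# Exit 0: nothing found. Exit 1: at least one match (printed with file and
# line number). Exit 2: grep failed, for example on an unreadable file.
# find_unpinned is a hypothetical helper, not part of the repository.
find_unpinned() (
  rc=0
  grep -nHw -- 'latest' "$@" || rc=$?
  case "$rc" in
    0) exit 1 ;;
    1) exit 0 ;;
    *) exit 2 ;;
  esac
)
```

It could be run over the files from the table, for example `find_unpinned bootstrap/lib/common.sh manifests/concourse/concourse.yml terraform/versions.tf`.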

Cattle Pattern

The management VM is disposable. All persistent state lives on the host at ./state/:

- vars-store.yml: all director credentials
- creds/director-state.json: BOSH director state file
- creds/mgmt_ssh: SSH key for VM access
- cache/: downloaded stemcells, releases, cloud images
- terraform.tfstate: infrastructure state

Rebuild workflow:

1. make down: destroy the VM (state preserved)
2. make up: recreate the VM
3. make bootstrap: converge director using existing state

The director will be re-created from vars-store.yml, preserving all credentials and deployments.
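Because the rebuild depends entirely on the host-side state, it can be worth confirming that the state files exist before destroying the VM. The following is a minimal sketch of such a pre-flight check; the function name is an assumption, and the file list is taken from the state inventory in this section (the cache/ directory is skipped because it can be re-downloaded).

```shell
# Verify that the state files needed for a rebuild exist and are non-empty.
# check_lab_state is a hypothetical helper; it takes the state directory as
# its only argument and defaults to ./state.
check_lab_state() (
  state="${1:-./state}"
  missing=0
  for f in vars-store.yml creds/director-state.json creds/mgmt_ssh terraform.tfstate; do
    if [ ! -s "${state}/${f}" ]; then
      echo "missing or empty: ${state}/${f}" >&2
      missing=1
    fi
  done
  exit "$missing"
)
```

A guarded rebuild would then read `check_lab_state ./state && make down && make up && make bootstrap`, so the VM is only destroyed when the credentials and state needed to recreate the director are in place.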