OCM with k3d
An alternative setup that uses k3d to create three lightweight k3s clusters (1 hub + 2 spokes) without Containerlab. Run all commands from the project root directory.
Topology
graph TD
subgraph Docker_Host["Docker Host"]
subgraph k3d["k3d Clusters — rancher/k3s:v1.33.6-k3s1"]
hub["🟢 ocm-hub (Server)<br/>cluster-manager<br/>API :6443"]
s1["🔵 ocm-spoke-1 (Server)<br/>klusterlet<br/>API :6443"]
s2["🔵 ocm-spoke-2 (Server)<br/>klusterlet<br/>API :6443"]
end
hub---s1
hub---s2
end
classDef hub fill:#1b5e20,color:#fff,stroke:#2e7d32
classDef spoke fill:#0d47a1,color:#fff,stroke:#1565c0
class hub hub
class s1,s2 spoke
Each cluster runs as a single-server k3s node. The hub exposes ports 80/443 for ingress.
Prerequisites
| Tool | Purpose |
|---|---|
| k3d | Creates k3s Docker clusters |
| kubectl | Kubernetes CLI |
| clusteradm | OCM bootstrapping |
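Before running any target, it can help to confirm that the three tools are on the PATH. The following is a minimal sketch, not part of the repository; the function name check_tools is hypothetical.

```shell
# Print each required tool that is missing from PATH.
# Returns 0 if all tools are present, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (commented out so the block only defines the helper):
# check_tools k3d kubectl clusteradm
```

A non-zero return status makes the helper usable as a guard at the top of a script, for example `check_tools k3d kubectl clusteradm || exit 1`.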
Quick Start
# From the root directory — full automated pipeline:
make ocm-demo
# Or step by step:
make ocm-create-cluster # Create 3 k3d clusters
make ocm-get-ca # Extract CA certs
make ocm-install-ocm # Initialize OCM hub
make ocm-register-spokes # Register spokes via clusteradm join
make ocm-accept # Accept managed clusters
Make Targets
| Target | Description |
|---|---|
| ocm-create-cluster | Create 3 k3d clusters (ocm-hub, ocm-spoke-1, ocm-spoke-2) |
| ocm-delete-cluster | Delete all 3 clusters |
| ocm-get-ca | Extract CA certificates from each cluster |
| ocm-kubeconfigs | Export kubeconfigs to ocm/kubeconfigs/k3d/ |
| ocm-install-ocm | Initialize OCM hub via clusteradm init |
| ocm-register-spokes | Register spoke clusters via clusteradm join |
| ocm-accept | Approve pending spoke CSRs via clusteradm accept |
| ocm-label | Label managed clusters with topology/capacity metadata |
| ocm-demo | Full pipeline: create → init → register → accept |
Cluster Config Files
- ocm-hub.yaml: Hub cluster (ports 80/443, TLS SANs for host.k3d.internal)
- ocm-spoke-1.yaml: First spoke cluster
- ocm-spoke-2.yaml: Second spoke cluster
Make Targets (ArgoCD)
| Target | Description |
|---|---|
| ocm-install-argocd | Install argocd-agent hub addon + MetalLB + TLS secrets |
| ocm-setup-argocd-ocm | Wait for addon + create AppProject on hub and spokes + sample Applications |
| ocm-argocd-login | Port-forward + print ArgoCD admin credentials |
Make Targets (Monitoring)
| Target | Description |
|---|---|
| ocm-deploy-monitoring | Deploy kube-prometheus-stack on hub + spoke exporters via direct Helm |
ArgoCD Integration
ArgoCD can be layered on the k3d OCM hub for GitOps-driven application delivery
to managed spoke clusters using the argocd-agent-addon.
This is a manual extension and is not included in make ocm-demo.
Quick Setup
# After OCM is deployed and spokes are accepted:
make ocm-install-argocd # Install argocd-agent-addon on k3d hub
make ocm-setup-argocd-ocm # Wait for addon + AppProject + guestbook Applications
Architecture
graph TB
subgraph Docker_Host["Docker Host"]
subgraph Hub["k3d-ocm-hub"]
OCM["OCM Hub<br/>cluster-manager"]
MetalLB["MetalLB<br/>172.18.0.200-172.18.0.210"]
Principal["argocd-agent-principal<br/>LoadBalancer IP: 172.18.0.201"]
GitOps["GitOpsCluster"]
Placement["Placement"]
App_Hub["guestbook Application<br/>(in namespace ocm-spoke-N)"]
end
subgraph Spoke1["k3d-ocm-spoke-1"]
Agent1["argocd-agent-agent"]
AC1["app-controller"]
Pod1["guestbook-ui pod<br/>(in guestbook ns)"]
end
subgraph Spoke2["k3d-ocm-spoke-2"]
Agent2["argocd-agent-agent"]
AC2["app-controller"]
Pod2["guestbook-ui pod<br/>(in guestbook ns)"]
end
end
Placement --> GitOps
GitOps --> Principal
Principal --> Agent1
Principal --> Agent2
App_Hub -.->|principal pushes spec| Agent1
App_Hub -.->|principal pushes spec| Agent2
Agent1 --> AC1 --> Pod1
Agent2 --> AC2 --> Pod2
classDef hub fill:#1b5e20,color:#fff,stroke:#2e7d32
classDef spoke fill:#0d47a1,color:#fff,stroke:#1565c0
classDef agent fill:#e65100,color:#fff,stroke:#ef6c00
class Hub hub
class Spoke1,Spoke2 spoke
class Principal,GitOps,Placement agent
What the scripts do
make ocm-install-argocd installs MetalLB and the argocd-agent hub addon:
- Deploys MetalLB with IP pool 172.18.0.200-172.18.0.210 (k3d Docker network range).
- Runs clusteradm install hub-addon --names argocd-agent --namespace argocd.
- Waits for principal TLS and resource-proxy TLS secrets to be created by the pull-integration-controller; creates them manually with a CA-signed certificate if the operator does not produce them within 2 minutes.
make ocm-setup-argocd-ocm waits for agents and creates resources:
- Waits for ManagedClusterAddOn argocd-agent-addon to become Available on all managed clusters. The OCM addon framework auto-deploys the argocd-agent-agent pod to each spoke via ManifestWork.
- Waits for the principal LoadBalancer IP from MetalLB.
- Creates the default AppProject on the hub (required for managed mode).
- Creates the default AppProject on each spoke. This is necessary because the principal pushes the Application to the argocd namespace on spokes but does not create the referenced project.
- Creates guestbook Application resources on the hub in each managed cluster's namespace (e.g., ocm-spoke-1, ocm-spoke-2).
No argocd cluster add, no kubeconfig rewriting, and no Python scripts needed.
Accessing the ArgoCD UI
make ocm-argocd-login port-forwards argocd-server to localhost:8080 and prints the admin password.
Multi-Cluster Monitoring
A monitoring stack can be deployed on top of OCM to collect metrics from
all clusters (hub + spokes) into a single Prometheus/Grafana instance on the hub.
Spoke exporters are deployed via direct Helm (not ArgoCD) because the ArgoCD agent
model requires Applications in spoke-namespaced scopes, which the ApplicationSet
clusterDecisionResource generator cannot produce.
Quick Start
make ocm-deploy-monitoring # Deploy the hub monitoring stack and the spoke exporters
Architecture
graph TB
subgraph Docker["Docker Host — k3d Docker Network (172.18.0.0/16)"]
subgraph Hub["k3d-ocm-hub"]
Prom["kube-prometheus-stack<br/>(Prometheus + Grafana)"]
Ingress["Traefik Ingress<br/>*.100.106.163.111.nip.io"]
end
subgraph Spoke1["k3d-ocm-spoke-1"]
NE1["node-exporter<br/>hostNetwork:9100"]
KSM1["kube-state-metrics<br/>NodePort:30101"]
end
subgraph Spoke2["k3d-ocm-spoke-2"]
NE2["node-exporter<br/>hostNetwork:9100"]
KSM2["kube-state-metrics<br/>NodePort:30101"]
end
subgraph LB["MetalLB Pool<br/>172.18.0.200-210"]
TL["Traefik LB<br/>172.18.0.200:80"]
end
end
Ingress --> TL
Prom -->|scrape 172.18.0.4:9100| NE1
Prom -->|scrape 172.18.0.4:30101| KSM1
Prom -->|scrape 172.18.0.5:9100| NE2
Prom -->|scrape 172.18.0.5:30101| KSM2
classDef hub fill:#1b5e20,color:#fff,stroke:#2e7d32
classDef spoke fill:#0d47a1,color:#fff,stroke:#1565c0
classDef lb fill:#e65100,color:#fff,stroke:#ef6c00
class Hub hub
class Spoke1,Spoke2 spoke
class TL,Ingress lb
- Hub: Full kube-prometheus-stack (Prometheus, Grafana ClusterIP, Alertmanager, node-exporter, kube-state-metrics)
- Spokes: Exporters deployed via direct helm upgrade --install using k3d kubeconfig get: node-exporter (hostNetwork:9100) and kube-state-metrics (NodePort 30101)
- Scraping: Hub Prometheus scrapes spoke exporters directly over the shared Docker network using container IPs (172.18.0.4, 172.18.0.5)
- Ingress: Traefik ingress controller at 172.18.0.200 with nip.io domains: grafana.100.106.163.111.nip.io, prometheus.100.106.163.111.nip.io, alertmanager.100.106.163.111.nip.io
What make ocm-deploy-monitoring does
- Resolves spoke container IPs from Docker (k3d-ocm-spoke-*-server-0)
- Injects IPs into additionalScrapeConfigs via sed
- Deploys kube-prometheus-stack on the hub via Helm:
  - Grafana (ClusterIP, user: admin, password: prom-operator)
  - Prometheus with additional scrape configs for spoke targets
  - Alertmanager, node-exporter, kube-state-metrics
- Deploys monitoring exporters on each spoke via direct helm upgrade --install:
  - kube-prometheus-stack chart with only nodeExporter and kubeStateMetrics enabled
  - node-exporter uses hostNetwork: true (port 9100 on host)
  - kube-state-metrics uses ClusterIP service internally
- Creates a NodePort service (30101) for kube-state-metrics on each spoke so the hub Prometheus can scrape it
- Applies the hub ingress (hub-ingress.yaml) for Grafana/Prometheus/Alertmanager via nip.io domains
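The first two steps (resolve the container IPs, then inject them with sed) could look roughly like the sketch below. It is an illustration only: the placeholder tokens SPOKE1_IP and SPOKE2_IP and both function names are assumptions, and the actual placeholders in hub-values.yaml may differ.

```shell
# Resolve the IP address of a container on its Docker network(s).
# Requires a running Docker daemon; not called in this block.
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Read a values template on stdin and replace the hypothetical placeholders
# SPOKE1_IP and SPOKE2_IP with the two addresses given as arguments.
inject_spoke_ips() {
  sed -e "s/SPOKE1_IP/$1/g" -e "s/SPOKE2_IP/$2/g"
}

# Example (commented out; container names follow the k3d naming shown above):
# inject_spoke_ips "$(container_ip k3d-ocm-spoke-1-server-0)" \
#                  "$(container_ip k3d-ocm-spoke-2-server-0)" \
#   < ocm/configs/monitoring/hub-values.yaml > /tmp/hub-values.rendered.yaml
```

Rendering to a separate file, rather than editing in place, keeps the checked-in values file unchanged between runs.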
Files
| File | Purpose |
|---|---|
| ocm/configs/monitoring/hub-values.yaml | Helm values for the hub's kube-prometheus-stack; includes placeholder spoke Docker IPs |
| ocm/configs/monitoring/hub-ingress.yaml | Ingress for grafana/prometheus/alertmanager on *.100.106.163.111.nip.io |
| ocm/configs/monitoring/spoke-values.yaml | Helm values for spoke exporters (node-exporter + kube-state-metrics only) |
| ocm/configs/monitoring/spoke-kube-state-metrics-nodeport.yaml | NodePort service (30101) for spoke kube-state-metrics |
| ocm/configs/monitoring/appset-spoke-exporters.yaml | (Reference only) ApplicationSet; abandoned, kept for documentation |
| ocm/configs/monitoring/generator-configmap.yaml | (Reference only) ConfigMap for clusterDecisionResource generator |
| ocm/configs/monitoring/placement-spoke-monitoring.yaml | (Reference only) OCM Placement selecting spoke clusters |
Verification
# Hub monitoring pods
kubectl --kubeconfig ocm/kubeconfigs/k3d/ocm-hub -n monitoring get pods
# Spoke monitoring pods
kubectl --kubeconfig ocm/kubeconfigs/k3d/ocm-spoke-1 -n monitoring get pods
kubectl --kubeconfig ocm/kubeconfigs/k3d/ocm-spoke-2 -n monitoring get pods
# Grafana — open in browser:
echo "http://grafana.100.106.163.111.nip.io"
# Prometheus targets (should show all spoke exporters as UP):
curl -s http://prometheus.100.106.163.111.nip.io/api/v1/targets | jq '.data.activeTargets[].labels' | grep -E '"cluster"|"instance"|"job"|"health"'
Manual End-to-End Setup
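The following walks through each step individually. The OCM bootstrap steps still call the individual make targets; the ArgoCD steps use raw commands instead of make ocm-install-argocd and make ocm-setup-argocd-ocm.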
Run from the project root.
1. Create clusters
make ocm-create-cluster
2. Extract CA certificates
make ocm-get-ca
3. Initialize OCM hub
make ocm-install-ocm
# Verify:
kubectl config use-context k3d-ocm-hub
kubectl get deployment -n open-cluster-management cluster-manager
4. Register spokes
make ocm-register-spokes
# Each spoke runs clusteradm join with the hub's API server address
# and a bootstrap token generated by clusteradm get token.
5. Accept spokes
make ocm-accept
# Verify:
kubectl get managedclusters
# Should show ocm-spoke-1 and ocm-spoke-2, each with HUB ACCEPTED, JOINED, and AVAILABLE reporting true
6. Export kubeconfigs
make ocm-kubeconfigs
# Verify:
ls ocm/kubeconfigs/k3d/
# Should show: ocm-hub, ocm-spoke-1, ocm-spoke-2
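The ls check above can be scripted so that later steps fail early when a kubeconfig is absent. This is a sketch under the assumption that the three file names match the cluster names listed above; check_kubeconfigs is a hypothetical helper, not a repository script.

```shell
# Return 0 only if every expected kubeconfig exists and is non-empty in DIR.
# Prints one line per missing or empty file.
check_kubeconfigs() {
  dir=$1
  status=0
  for name in ocm-hub ocm-spoke-1 ocm-spoke-2; do
    if [ ! -s "$dir/$name" ]; then
      echo "missing or empty: $dir/$name"
      status=1
    fi
  done
  return "$status"
}

# Example call (commented out so the block only defines the helper):
# check_kubeconfigs ocm/kubeconfigs/k3d
```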
7. Install MetalLB
This setup uses MetalLB to assign LoadBalancer IPs from the k3d Docker network to the principal service.
# Deploy MetalLB CRDs and controller
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.7/config/manifests/metallb-native.yaml
# Wait for the controller
kubectl -n metallb-system rollout status deployment/controller --timeout=120s
# Configure an IP pool within the k3d Docker network (172.18.0.0/16)
cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: k3d-pool
  namespace: metallb-system
spec:
  addresses:
    - 172.18.0.200-172.18.0.210
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: k3d-l2
  namespace: metallb-system
EOF
8. Install argocd-agent hub-addon
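Running clusteradm install hub-addon --names argocd-agent --namespace argocd (the command used by make ocm-install-argocd) deploys: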
- ArgoCD Operator + ArgoCD CR (which starts the principal pod)
- argocd-pull-integration-controller — orchestrates the addon lifecycle
- GitOpsCluster — links the OCM Placement to ArgoCD
- ClusterManagementAddOn — registers the addon with OCM
9. Wait for TLS secrets
The pull-integration-controller creates certificates for the principal and resource-proxy endpoints. The principal pod may fail to start if these are missing — wait for them or create manually.
# Wait for principal TLS secret (up to 30 attempts, 2 s apart, about 60 s)
for i in $(seq 1 30); do
  if kubectl -n argocd get secret argocd-agent-principal-tls &>/dev/null; then
    # i counts attempts, so the elapsed time is (i - 1) * 2 seconds
    echo "Principal TLS secret ready after ~$(( (i - 1) * 2 ))s"
    break
  fi
  sleep 2
done
# Wait for resource-proxy TLS secret (same polling scheme)
for i in $(seq 1 30); do
  if kubectl -n argocd get secret argocd-agent-resource-proxy-tls &>/dev/null; then
    echo "Resource-proxy TLS secret ready after ~$(( (i - 1) * 2 ))s"
    break
  fi
  sleep 2
done
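The polling loops in this step and in step 11 share one pattern: retry a command a fixed number of times with a fixed delay. A small helper can express that once and also report a timeout through its exit status, which the loops above do not. This is a sketch; wait_for is a hypothetical name and not part of the repository scripts.

```shell
# wait_for TRIES DELAY CMD [ARGS...]
# Run CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on success and 1 if every attempt failed.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  n=0
  while [ "$n" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    n=$((n + 1))
    # Sleep only when another attempt will follow.
    if [ "$n" -lt "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (commented out; needs a cluster):
# wait_for 30 2 kubectl -n argocd get secret argocd-agent-principal-tls \
#   || echo "secret still missing, use the manual fallback"
```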
Manual TLS fallback
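If the secrets do not appear after 60 seconds, create them manually. As written, the snippet below issues self-signed certificates: it fetches the operator's CA certificate but does not sign with it, so adapt it if the agents must validate the certificates against that CA.
# Fetch the CA certificate for reference (not used for signing below)
CA_CERT=$(kubectl -n argocd get secret argocd-agent-ca \
  -o jsonpath='{.data.ca\.crt}' | base64 -d)
echo "$CA_CERT" > /tmp/argocd-ca.crt
for NAME in argocd-agent-principal argocd-agent-resource-proxy; do
  # Self-signed certificate; -addext requires OpenSSL 1.1.1 or newer
  openssl req -x509 -newkey rsa:2048 -keyout "/tmp/$NAME.key" \
    -out "/tmp/$NAME.crt" -days 365 -nodes \
    -subj "/CN=$NAME" \
    -addext "subjectAltName=DNS:$NAME,DNS:$NAME.argocd.svc"
  kubectl -n argocd create secret tls "$NAME-tls" \
    --cert="/tmp/$NAME.crt" --key="/tmp/$NAME.key"
done
# Clean up; the file names follow the NAME values used in the loop
rm -f /tmp/argocd-ca.crt /tmp/argocd-agent-principal.* /tmp/argocd-agent-resource-proxy.*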
10. Verify principal pod
kubectl -n argocd get pod -l app.kubernetes.io/name=argocd-agent-principal
# Wait for Ready:
kubectl -n argocd wait --for=condition=Ready pod \
-l app.kubernetes.io/name=argocd-agent-principal --timeout=180s
11. Wait for the LoadBalancer IP
MetalLB assigns an external IP from the configured pool.
# Wait for LoadBalancer ingress IP
for i in $(seq 1 30); do
  IP=$(kubectl -n argocd get svc argocd-agent-principal \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
  if [ -n "$IP" ]; then
    echo "Principal LB IP: $IP"
    break
  fi
  sleep 2
done
Expected IP
The LoadBalancer IP should be 172.18.0.201, the second address in the
MetalLB pool (172.18.0.200-172.18.0.210). If MetalLB assigns a different
address, use that one instead. This IP is used as the destination.server in
Applications destined for spoke clusters.
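To confirm that the assigned address really comes from the configured pool, the dotted-quad addresses can be compared as integers. The sketch below assumes IPv4 only; ip_to_int and ip_in_range are hypothetical helper names.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
# Runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# ip_in_range IP FIRST LAST: succeed if FIRST <= IP <= LAST.
ip_in_range() {
  ip=$(ip_to_int "$1")
  first=$(ip_to_int "$2")
  last=$(ip_to_int "$3")
  [ "$ip" -ge "$first" ] && [ "$ip" -le "$last" ]
}

# Example (commented out; IP is the variable set by the loop in step 11):
# ip_in_range "$IP" 172.18.0.200 172.18.0.210 || echo "unexpected LB IP: $IP"
```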
12. Verify GitOpsCluster
All conditions should be True: ServerDiscovered, RBACReady,
CACertificateReady, PrincipalCertificateReady, ClustersImported,
AddonConfigured, etc.
13. Wait for spoke agent addon
The OCM addon framework deploys the argocd-agent-agent pod to each spoke
selected by the default Placement.
# Watch addon availability
kubectl get managedclusteraddon -A -w
# Should show both spokes with Available=True
14. Create default AppProject
The managed-mode principal pushes Applications to the argocd namespace
on spoke clusters. The default AppProject must exist there, otherwise
the spoke application-controller rejects the Application.
# On the hub
kubectl -n argocd apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default
  namespace: argocd
spec:
  clusterResourceWhitelist:
    - group: '*'
      kind: '*'
  destinations:
    - namespace: '*'
      server: '*'
  sourceNamespaces:
    - '*'
  sourceRepos:
    - '*'
EOF
# On each spoke
for SPOKE in ocm-spoke-1 ocm-spoke-2; do
  KUBECONFIG=ocm/kubeconfigs/k3d/$SPOKE kubectl -n argocd apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default
  namespace: argocd
spec:
  clusterResourceWhitelist:
    - group: '*'
      kind: '*'
  destinations:
    - namespace: '*'
      server: '*'
  sourceNamespaces:
    - '*'
  sourceRepos:
    - '*'
EOF
done
15. Create target namespace
for SPOKE in ocm-spoke-1 ocm-spoke-2; do
  KUBECONFIG=ocm/kubeconfigs/k3d/$SPOKE kubectl create namespace guestbook --dry-run=client -o yaml | \
    KUBECONFIG=ocm/kubeconfigs/k3d/$SPOKE kubectl apply -f -
done
16. Create sample Application
Create the guestbook Application in each managed cluster's namespace on the hub. The principal watches these namespaces and pushes the Application spec to the corresponding spoke agent.
# Substitute the LoadBalancer IP from step 11
LB_IP=172.18.0.201
for SPOKE in ocm-spoke-1 ocm-spoke-2; do
  kubectl -n $SPOKE apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook
  namespace: $SPOKE
spec:
  project: default
  source:
    repoURL: https://github.com/argoproj/argocd-example-apps
    targetRevision: HEAD
    path: guestbook
  destination:
    server: https://$LB_IP:443?agentName=$SPOKE
    namespace: guestbook
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
done
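The destination.server value combines the principal's LoadBalancer IP with the name of the target agent as a query parameter. A small helper makes that format explicit and rejects empty inputs, which would otherwise produce a syntactically valid but unusable URL. This is a sketch; agent_destination is a hypothetical name.

```shell
# agent_destination LB_IP AGENT_NAME
# Print the destination.server URL in the form used by the Application above.
agent_destination() {
  lb_ip=$1
  agent=$2
  if [ -z "$lb_ip" ] || [ -z "$agent" ]; then
    echo "usage: agent_destination LB_IP AGENT_NAME" >&2
    return 1
  fi
  printf 'https://%s:443?agentName=%s\n' "$lb_ip" "$agent"
}

# Example call (commented out so the block only defines the helper):
# agent_destination 172.18.0.201 ocm-spoke-1
```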
17. Validation
After about a minute, verify that the application syncs end to end:
# On hub — application status
kubectl get applications -A -o wide
# Both should show Sync Status: Synced, Health Status: Healthy
# On spoke-1 — pods should be running
kubectl --context k3d-ocm-spoke-1 -n guestbook get pods
# On spoke-2
kubectl --context k3d-ocm-spoke-2 -n guestbook get pods
# Agent pod on spoke-1
kubectl --context k3d-ocm-spoke-1 -n argocd get pod -l app.kubernetes.io/name=argocd-agent-agent
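The manual check of sync and health status can be reduced to a pass/fail filter. The sketch below reads one line per Application in the form "namespace name sync health" and prints only those that are not both Synced and Healthy. The function name report_unready_apps is hypothetical, and the jsonpath expression in the comment is one possible way to produce that input.

```shell
# Read "namespace name sync health" lines on stdin and print every
# Application that is not both Synced and Healthy.
# Exit status is 1 if at least one such Application was found.
report_unready_apps() {
  awk '$3 != "Synced" || $4 != "Healthy" {
         print $1 "/" $2 ": " $3 "/" $4
         bad = 1
       }
       END { exit bad + 0 }'
}

# Example (commented out; needs the hub context):
# kubectl get applications -A -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.sync.status}{" "}{.status.health.status}{"\n"}{end}' \
#   | report_unready_apps
```

An Application whose status fields are still empty also counts as not ready, because the missing fields do not match Synced or Healthy.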
Troubleshooting
- App stuck at Unknown: Check spoke app-controller logs for error getting app project. The default AppProject must exist in the argocd namespace on the spoke.
- Agent not connecting: Verify the principal's LoadBalancer IP is reachable from spoke containers (k3d-ocm-spoke-* share the Docker network).
- Principal pod CrashLoopBackOff: Check that the argocd-agent-resource-proxy-tls secret exists; the operator may not have created it yet. Use the manual TLS fallback from step 9.
- Cannot delete and recreate Application: If you delete the Application on the spoke manually, the agent reports the count drop to the principal. Restart the principal pod (kubectl -n argocd rollout restart deployment/argocd-agent-principal) to trigger a full re-sync.