First cut at ceph docs
Signed-off-by: David Young <davidy@funkypenguin.co.nz>
@@ -1,4 +1,6 @@
!!! tip "Fast-track your fluxing! 🚀"

    Is crafting all these YAMLs by hand too much of a PITA?

    I automatically and **instantly** share (_with my [sponsors](https://github.com/sponsors/funkypenguin)_) a private "[_premix_](https://geek-cookbook.funkypenguin.co.nz/premix/)" git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository, for each chosen recipe!

    Let the machines :material-robot-outline: do the TOIL! :man_lifting_weights:

@@ -1 +1,422 @@

---
title: Deploy Rook Ceph's operator-managed Cluster for Persistent Storage in Kubernetes
description: Step #2 - Having the operator available, now we deploy the ceph cluster itself
---

# Persistent storage in Kubernetes with Rook Ceph / CephFS - Cluster

[Ceph](https://docs.ceph.com/en/quincy/) is a highly-reliable, scalable network storage platform which uses individual disks across participating nodes to provide fault-tolerant storage.

[Rook](https://rook.io) provides an operator for Ceph, decomposing the [10-year-old](https://en.wikipedia.org/wiki/Ceph_(software)#Release_history), at-times-arcane platform into cloud-native components, created declaratively, whose lifecycle is managed by an operator.

In the [previous recipe](/kubernetes/persistence/rook-ceph/operator/), we deployed the operator. Now, to actually deploy a Ceph cluster, we need to create a custom resource (*a "CephCluster"*), which will instruct the operator on how we'd like our cluster to be deployed.

We'll end up with multiple storageClasses which we can use to allocate storage to pods from either Ceph RBD (*block storage*) or CephFS (*a mounted filesystem*). In many cases, CephFS is a useful choice, because it can be mounted by more than one pod **at the same time**, which makes it suitable for apps which need to share access to the same data ([NZBGet][nzbget], [Sonarr][sonarr], and [Plex][plex], for example).

## Rook Ceph Cluster requirements

!!! summary "Ingredients"

    Already deployed:

    * [x] A [Kubernetes cluster](/kubernetes/cluster/)
    * [x] [Flux deployment process](/kubernetes/deployment/flux/) bootstrapped
    * [x] Rook Ceph's [Operator](/kubernetes/persistence/rook-ceph/operator/)

## Preparation

### Namespace

We already deployed a `rook-ceph` namespace when deploying the Rook Ceph [Operator](/kubernetes/persistence/rook-ceph/operator/), so we don't need to create this again :thumbsup: [^1]

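If you'd like to double-check before proceeding, a quick (*optional*) verification is to confirm the namespace is still present:

```bash
# expect to see the rook-ceph namespace created in the operator recipe
kubectl get namespace rook-ceph
```
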
### HelmRepository

Likewise, we'll install the `rook-ceph-cluster` helm chart from the same Rook-managed repository as we did the `rook-ceph` (operator) chart, so we don't need to create a new HelmRepository.

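For reference, the HelmRepository created in the operator recipe looks roughly like this (*a sketch: the `rook-release` name and `flux-system` namespace match what the HelmRelease below expects, but your interval may differ*):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: rook-release
  namespace: flux-system
spec:
  interval: 15m
  url: https://charts.rook.io/release # the official Rook helm chart repository
```
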
### Kustomization

We do, however, need a separate Kustomization for rook-ceph-cluster, telling flux to deploy any YAMLs found in the repo at `/rook-ceph-cluster`. I create this example Kustomization in my flux repo:

!!! question "Why a separate Kustomization if both are needed for rook-ceph?"

    While technically we **could** use the same Kustomization to deploy both `rook-ceph` and `rook-ceph-cluster`, we'd run into dependency issues. It's simpler and cleaner to deploy `rook-ceph` first, and then list it as a dependency for `rook-ceph-cluster`.

```yaml title="/bootstrap/kustomizations/kustomization-rook-ceph-cluster.yaml"
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: rook-ceph-cluster--rook-ceph
  namespace: flux-system
spec:
  dependsOn:
    - name: "rook-ceph" # (1)!
  interval: 30m
  path: ./rook-ceph-cluster
  prune: true # remove any elements later removed from the above path
  timeout: 10m # if not set, this defaults to interval duration, which is 1h
  sourceRef:
    kind: GitRepository
    name: flux-system
  validation: server
```

1. Note that we use `spec.dependsOn` to ensure that this Kustomization is only applied **after** the rook-ceph operator is deployed and operational. This ensures that the necessary CRDs are in place, and avoids a dry-run error during reconciliation.

--8<-- "premix-cta-kubernetes.md"

### ConfigMap

Now we're into the app-specific YAMLs. First, we create a ConfigMap, containing the entire contents of the helm chart's [values.yaml](https://github.com/rook/rook/blob/master/deploy/charts/rook-ceph-cluster/values.yaml). Paste the values into a `values.yaml` key as illustrated below, indented 4 spaces (*since they're "encapsulated" within the ConfigMap YAML*). I create this example yaml in my flux repo:

```yaml title="/rook-ceph-cluster/configmap-rook-ceph-cluster-helm-chart-value-overrides.yaml"
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-cluster-helm-chart-value-overrides
  namespace: rook-ceph
data:
  values.yaml: |- # (1)!
    # <upstream values go here>
```

1. Paste in the contents of the upstream `values.yaml` here, indented 4 spaces, and then change the values you need as illustrated below.

Here are some suggested changes to the defaults which you should consider:

```yaml
toolbox:
  enabled: true # (1)!
monitoring:
  # enabling will also create RBAC rules to allow Operator to create ServiceMonitors
  enabled: true # (2)!
  # whether to create the prometheus rules
  createPrometheusRules: true # (3)!
pspEnable: false # (4)!
ingress:
  dashboard: {} # (5)!
```

1. It's useful to have a "toolbox" pod to shell into to run ceph CLI commands
2. Consider enabling if you already have Prometheus installed
3. Consider enabling if you already have Prometheus installed
4. PSPs are deprecated, and will be removed in Kubernetes 1.25, at which point leaving this enabled will cause breakage.
5. Customize the ingress configuration for your dashboard

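For example (*a sketch only: the hostname, ingress class and TLS secret are assumptions specific to your environment, and the layout follows the chart's `ingress.dashboard` values*), an nginx-backed dashboard ingress might look something like this:

```yaml
ingress:
  dashboard:
    ingressClassName: nginx # assumes an nginx ingress controller
    host:
      name: rook-ceph.example.com # a hypothetical hostname
      path: "/"
    tls:
      - hosts:
          - rook-ceph.example.com
        secretName: rook-ceph-dashboard-tls # a hypothetical pre-existing TLS secret
```
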

Further to the above, decide which disks you want to dedicate to Ceph, and add them to the `cephClusterSpec` section.

The default configuration (below) will cause the operator to use any un-formatted disks found on any of your nodes. If this is what you **want** to happen, then you don't need to change anything.

```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: true
```

If you'd rather be a little more selective / declarative about which disks are used in a homogeneous cluster, you could consider using `deviceFilter`, like this:

```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: false
    deviceFilter: sdc #(1)!
```

1. A regex used to filter target devices found on each node

If your cluster nodes are a little more snowflakey :snowflake:, here's a more complex example:

```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    nodes:
      - name: "teeny-tiny-node"
        deviceFilter: "." #(1)!
      - name: "bigass-node"
        devices:
          - name: "/dev/disk/by-path/pci-0000:01:00.0-sas-exp0x500404201f43b83f-phy11-lun-0" #(2)!
            config:
              metadataDevice: "/dev/osd-metadata/11"
          - name: "nvme0n1" #(3)!
          - name: "nvme1n1"
```

1. Match any devices found on this node
2. Match a very-specific device path, and pair this device with a faster device for OSD metadata
3. Match devices with simple regex string matches

### HelmRelease

Finally, having set the scene above, we define the HelmRelease which will actually deploy the Ceph cluster itself into our Kubernetes cluster. I save this in my flux repo:

```yaml title="/rook-ceph-cluster/helmrelease-rook-ceph-cluster.yaml"
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
spec:
  chart:
    spec:
      chart: rook-ceph-cluster
      version: 1.9.x
      sourceRef:
        kind: HelmRepository
        name: rook-release
        namespace: flux-system
  interval: 30m
  timeout: 10m
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: -1 # keep trying to remediate
    crds: CreateReplace # Upgrade CRDs on package update
  releaseName: rook-ceph-cluster
  valuesFrom:
    - kind: ConfigMap
      name: rook-ceph-cluster-helm-chart-value-overrides
      valuesKey: values.yaml # (1)!
```

1. This is the default, but best to be explicit for clarity

## Install Rook Ceph Cluster!

Commit the changes to your flux repository, and either wait for the reconciliation interval, or force a reconciliation using `flux reconcile source git flux-system`. You should see the kustomization appear...

```bash
~ ❯ flux get kustomizations rook-ceph-cluster
NAME               READY  MESSAGE                         REVISION      SUSPENDED
rook-ceph-cluster  True   Applied revision: main/345ee5e  main/345ee5e  False
~ ❯
```

The helmrelease should be reconciled...

```bash
~ ❯ flux get helmreleases -n rook-ceph rook-ceph-cluster
NAME               READY  MESSAGE                           REVISION  SUSPENDED
rook-ceph-cluster  True   Release reconciliation succeeded  v1.9.9    False
~ ❯
```

And you should still have a happy rook-ceph operator pod:

```bash
~ ❯ k get pods -n rook-ceph -l app=rook-ceph-operator
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-operator-7c94b7446d-nwsss   1/1     Running   0          5m14s
~ ❯
```

To watch the operator do its magic, you can tail its logs, using:

```bash
k logs -n rook-ceph -f -l app=rook-ceph-operator
```

You can **get** or **describe** the status of your cephcluster:

```bash
~ ❯ k get cephclusters.ceph.rook.io -n rook-ceph
NAME        DATADIRHOSTPATH   MONCOUNT   AGE     PHASE   MESSAGE                        HEALTH      EXTERNAL
rook-ceph   /var/lib/rook     3          6d22h   Ready   Cluster created successfully   HEALTH_OK
~ ❯
```
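
Since we enabled the toolbox in the helm chart values above, you can also (*optionally*) shell into the toolbox pod to run ceph CLI commands directly; the deployment name below assumes the chart's default toolbox naming:

```bash
# query overall cluster health from inside the toolbox pod
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
```
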

### How do I know it's working?

So we have a ceph cluster now, but how do we know we can actually provision volumes?

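First, confirm that the storageClasses created by the helm chart are present (*the `ceph-block` and `ceph-filesystem` names used below are the chart defaults; adjust if you renamed them in your values*):

```bash
kubectl get storageclass
```
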
#### Create PVCs

Create two ceph-block PVCs (*persistent volume claims*), by running:

```bash
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-block-pvc-1
  labels:
    test: ceph
    funkypenguin-is: a-smartass
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-block
  resources:
    requests:
      storage: 128Mi
EOF
```

And:

```bash
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-block-pvc-2
  labels:
    test: ceph
    funkypenguin-is: a-smartass
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-block
  resources:
    requests:
      storage: 128Mi
EOF
```

Now create a ceph-filesystem (RWX) PVC, by running:

```bash
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-filesystem-pvc
  labels:
    test: ceph
    funkypenguin-is: a-smartass
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ceph-filesystem
  resources:
    requests:
      storage: 128Mi
EOF
```

Examine the PVCs by running:

```bash
kubectl get pvc -l test=ceph
```

#### Create Pods

Now create pods to consume the PVCs, by running:

```bash
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: ceph-test-1
  labels:
    test: ceph
    funkypenguin-is: a-smartass
spec:
  containers:
    - name: volume-test
      image: nginx:stable-alpine
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: ceph-block-is-rwo
          mountPath: /rwo
        - name: ceph-filesystem-is-rwx
          mountPath: /rwx
      ports:
        - containerPort: 80
  volumes:
    - name: ceph-block-is-rwo
      persistentVolumeClaim:
        claimName: ceph-block-pvc-1
    - name: ceph-filesystem-is-rwx # the shared CephFS volume, also mounted by ceph-test-2
      persistentVolumeClaim:
        claimName: ceph-filesystem-pvc
EOF
```

And:

```bash
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: ceph-test-2
  labels:
    test: ceph
    funkypenguin-is: a-smartass
spec:
  containers:
    - name: volume-test
      image: nginx:stable-alpine
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: ceph-block-is-rwo
          mountPath: /rwo
        - name: ceph-filesystem-is-rwx
          mountPath: /rwx
      ports:
        - containerPort: 80
  volumes:
    - name: ceph-block-is-rwo
      persistentVolumeClaim:
        claimName: ceph-block-pvc-2
    - name: ceph-filesystem-is-rwx
      persistentVolumeClaim:
        claimName: ceph-filesystem-pvc
EOF
```

Ensure the pods have started successfully (*this indicates the PVCs were correctly attached*) by running:

```bash
kubectl get pod -l test=ceph
```
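
To prove that the CephFS volume really is shared read-write between the two pods, you could (*optionally*) write a file from one pod and read it back from the other (*pod names and mount paths match the test pods created above*):

```bash
kubectl exec ceph-test-1 -- sh -c 'echo "hello from ceph-test-1" > /rwx/hello.txt'
kubectl exec ceph-test-2 -- cat /rwx/hello.txt
```
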

#### Clean up

Assuming that the pods are in a `Running` state, then Ceph is working!

Clean up your mess, little bare-metal-cave-monkey :monkey_face:, by running:

```bash
kubectl delete pod -l funkypenguin-is=a-smartass
kubectl delete pvc -l funkypenguin-is=a-smartass #(1)!
```

1. Label selectors are powerful!

### View Ceph Dashboard

Assuming you have an Ingress Controller set up, and you've either picked a default IngressClass or defined the dashboard ingress appropriately, you should be able to access your Ceph Dashboard at the URL identified by the ingress (*this is a good opportunity to check that the ingress deployed correctly*):

```bash
~ ❯ k get ingress -n rook-ceph
NAME                      CLASS   HOSTS                          ADDRESS        PORTS     AGE
rook-ceph-mgr-dashboard   nginx   rook-ceph.batcave.awesome.me   172.16.237.1   80, 443   177d
~ ❯
```
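
If you haven't set up an ingress for the dashboard yet, a quick (if less elegant) way to take a peek is to port-forward to the dashboard service (*the service name and port below are the defaults when SSL is left enabled under `cephClusterSpec.dashboard`*):

```bash
kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 8443:8443
```

Then browse to <https://localhost:8443>, and log in as the `admin` user, using the password retrieved below.
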

The dashboard credentials are automatically generated for you by the operator, and stored in a Kubernetes secret. To retrieve your credentials, run:

```bash
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o \
    jsonpath="{['data']['password']}" | base64 --decode && echo
```

## Summary

What have we achieved? We now have a fully-functional Ceph cluster, deployed declaratively by the Rook operator, providing storageClasses so that block (Ceph RBD) and shared-filesystem (CephFS) volumes can be provisioned for our pods!

!!! summary "Summary"
    Created:

    * [X] Ceph cluster has been deployed
    * [X] StorageClasses are available so that the cluster storage can be consumed by your pods
    * [X] Pretty graphs are viewable in the Ceph Dashboard

--8<-- "recipe-footer.md"

[^1]: Unless you **wanted** to deploy your cluster components in a separate namespace to the operator, of course!

@@ -3,14 +3,13 @@ title: Deploy Rook Ceph Operator for Persistent Storage in Kubernetes
description: Start your Rook Ceph deployment by installing the operator into your Kubernetes cluster
---

# Persistent storage in Kubernetes with Rook Ceph / CephFS - Operator

[Ceph](https://docs.ceph.com/en/quincy/) is a highly-reliable, scalable network storage platform which uses individual disks across participating nodes to provide fault-tolerant storage.

[Rook](https://rook.io) provides an operator for Ceph, decomposing the [10-year-old](https://en.wikipedia.org/wiki/Ceph_(software)#Release_history), at-times-arcane platform into cloud-native components, created declaratively, whose lifecycle is managed by an operator.

To start off with, we need to deploy the ceph operator into the cluster, after which we'll be able to actually deploy our [ceph cluster itself](/kubernetes/persistence/rook-ceph/cluster/).

## Rook Ceph requirements

@@ -27,7 +26,7 @@ description: Start your Rook Ceph deployment by installing the operator into you

We need a namespace to deploy our HelmRelease and associated ConfigMaps into. Per the [flux design](/kubernetes/deployment/flux/), I create this example yaml in my flux repo at `/bootstrap/namespaces/namespace-rook-ceph.yaml`:

```yaml title="/bootstrap/namespaces/namespace-rook-ceph.yaml"
apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
```

@@ -177,7 +176,8 @@ What have we achieved? We're half-way to getting a ceph cluster, having deployed

* [X] Rook ceph operator running and ready to deploy a cluster!

Next:

* [ ] Deploy the ceph [cluster](/kubernetes/persistence/rook-ceph/cluster/) using a CR

--8<-- "recipe-footer.md"