From d23be9b7eb4e5486dcfd54148e36514f6a7201bd Mon Sep 17 00:00:00 2001
From: David Young
Date: Thu, 18 Aug 2022 12:44:50 +1200
Subject: [PATCH] First cut at ceph docs

Signed-off-by: David Young
---
 _snippets/premix-cta-kubernetes.md    |   4 +-
 .../persistence/rook-ceph/cluster.md  | 423 +++++++++++++++++-
 .../persistence/rook-ceph/operator.md |  14 +-
 3 files changed, 432 insertions(+), 9 deletions(-)

diff --git a/_snippets/premix-cta-kubernetes.md b/_snippets/premix-cta-kubernetes.md
index 159ff8c..94b197a 100644
--- a/_snippets/premix-cta-kubernetes.md
+++ b/_snippets/premix-cta-kubernetes.md
@@ -1,4 +1,6 @@
 !!! tip "Fast-track your fluxing! 🚀"

     Is crafting all these YAMLs by hand too much of a PITA?
-    I automatically and **instantly** share (_with my [sponsors](https://github.com/sponsors/funkypenguin)_) a private "[_premix_](https://geek-cookbook.funkypenguin.co.nz/premix/)" git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository! :thumbsup:
\ No newline at end of file
+    I automatically and **instantly** share (_with my [sponsors](https://github.com/sponsors/funkypenguin)_) a private "[_premix_](https://geek-cookbook.funkypenguin.co.nz/premix/)" git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository, for each chosen recipe!
+
+    Let the machines :material-robot-outline: do the TOIL! :man_lifting_weights:
\ No newline at end of file
diff --git a/manuscript/kubernetes/persistence/rook-ceph/cluster.md b/manuscript/kubernetes/persistence/rook-ceph/cluster.md
index b4b5d7d..13672e1 100644
--- a/manuscript/kubernetes/persistence/rook-ceph/cluster.md
+++ b/manuscript/kubernetes/persistence/rook-ceph/cluster.md
@@ -1 +1,422 @@
-Working on this, check back soon! ;)
\ No newline at end of file
---
title: Deploy Rook Ceph's operator-managed cluster for Persistent Storage in Kubernetes
description: Step #2 - Having the operator available, we now deploy the Ceph cluster itself
---

# Persistent storage in Kubernetes with Rook Ceph / CephFS - Cluster

[Ceph](https://docs.ceph.com/en/quincy/) is a highly-reliable, scalable network storage platform which uses individual disks across participating nodes to provide fault-tolerant storage.

[Rook](https://rook.io) provides an operator for Ceph, decomposing the [10-year-old](https://en.wikipedia.org/wiki/Ceph_(software)#Release_history), at-times-arcane platform into cloud-native components, created declaratively, whose lifecycle is managed by an operator.

In the [previous recipe](/kubernetes/persistence/rook-ceph/operator/), we deployed the operator. Now, to actually deploy a Ceph cluster, we create a custom resource (*a "CephCluster"*), which instructs the operator on how we'd like our cluster deployed.

We'll end up with multiple storageClasses which we can use to allocate storage to pods, from either Ceph RBD (*block storage*) or CephFS (*a mounted filesystem*). In many cases, CephFS is a useful choice, because it can be mounted by more than one pod **at the same time**, which makes it suitable for apps which need to share access to the same data ([NZBGet][nzbget], [Sonarr][sonarr], and [Plex][plex], for example).
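To make that concrete, here's a minimal sketch of the sort of shared claim CephFS makes possible (*the claim name is illustrative, and `ceph-filesystem` is the storageClass name the chart creates by default*):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-media # illustrative name only
spec:
  accessModes:
    - ReadWriteMany # several pods may mount this volume read-write simultaneously
  resources:
    requests:
      storage: 10Gi
  storageClassName: ceph-filesystem # assumed: the chart's default cephFileSystems storageClass
```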
summary "Ingredients" + + Already deployed: + + * [x] A [Kubernetes cluster](/kubernetes/cluster/) + * [x] [Flux deployment process](/kubernetes/deployment/flux/) bootstrapped + * [x] Rook Ceph's [Operator](/kubernetes/persistence/rook-ceph/operator/) + +## Preparation + +### Namespace + +We already deployed a `rook-ceph` namespace when deploying the Rook Ceph [Operator](/kubernetes/persistence/rook-ceph/operator/), so we don't need to create this again :thumbsup: [^1] + +### HelmRepository + +Likewise, we'll install the `rook-ceph-cluster` helm chart from the same Rook-managed repository as we did the `rook-ceph` (operator) chart, so we don't need to create a new HelmRepository. + +### Kustomization + +We do, however, need a separate Kustomization for rook-ceph-cluster, telling flux to deploy any YAMLs found in the repo at `/rook-ceph-cluster`. I create this example Kustomization in my flux repo: + +!!! question "Why a separate Kustomization if both are needed for rook-ceph?" + While technically we **could** use the same Kustomization to deploy both `rook-ceph` and `rook-ceph-cluster`, we'd run into dependency issues. It's simpler and cleaner to deploy `rook-ceph` first, and then list it as a dependency for `rook-ceph-cluster`. + +```yaml title="/bootstrap/kustomizations/kustomization-rook-ceph-cluster.yaml" +apiVersion: kustomize.toolkit.fluxcd.io/v1beta2 +kind: Kustomization +metadata: + name: rook-ceph-cluster--rook-ceph + namespace: flux-system +spec: + dependsOn: + - name: "rook-ceph" # (1)! + interval: 30m + path: ./rook-ceph-cluster + prune: true # remove any elements later removed from the above path + timeout: 10m # if not set, this defaults to interval duration, which is 1h + sourceRef: + kind: GitRepository + name: flux-system + validation: server +``` + +1. Note that we use the `spec.dependsOn` to ensure that this Kustomization is only applied **after** the rook-ceph operator is deployed and operational. This ensures that the necessary CRDs are in place, and avoids a dry-run error on the reconciliation. + +--8<-- "premix-cta-kubernetes.md" + +### ConfigMap + +Now we're into the app-specific YAMLs. First, we create a ConfigMap, containing the entire contents of the helm chart's [values.yaml](https://github.com/rook/rook/blob/master/deploy/charts/rook-ceph/values.yaml). Paste the values into a `values.yaml` key as illustrated below, indented 4 tabs (*since they're "encapsulated" within the ConfigMap YAML*). I create this example yaml in my flux repo: + +```yaml title="/rook-ceph-cluster/configmap-rook-ceph-cluster-helm-chart-value-overrides.yaml" +apiVersion: v1 +kind: ConfigMap +metadata: + name: rook-ceph-cluster-helm-chart-value-overrides + namespace: rook-ceph-cluster +data: + values.yaml: |- # (1)! + # +``` + +1. Paste in the contents of the upstream `values.yaml` here, intended 4 spaces, and then change the values you need as illustrated below. + +Here are some suggested changes to the defaults which you should consider: + +```yaml +toolbox: + enabled: true # (1)! +monitoring: + # enabling will also create RBAC rules to allow Operator to create ServiceMonitors + enabled: true # (2)! + # whether to create the prometheus rules + createPrometheusRules: true # (3)! +pspEnable: false # (4)! +ingress: + dashboard: {} # (5)! +``` + +1. It's useful to have a "toolbox" pod to shell into to run ceph CLI commands +2. Consider enabling if you already have Prometheus installed +3. Consider enabling if you already have Prometheus installed +4. 
Further to the above, decide which disks you want to dedicate to Ceph, and add them to the `cephClusterSpec` section.

The default configuration (below) will cause the operator to use any un-formatted disks found on any of your nodes. If this is what you **want** to happen, then you don't need to change anything.

```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: true
```

If you'd rather be a little more selective / declarative about which disks are used in a homogeneous cluster, you could consider using `deviceFilter`, like this:

```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: false
    deviceFilter: sdc #(1)!
```

1. A regex used to filter the target devices found on each node (*anchored patterns like `^sd[b-c]` work too*)

If your cluster nodes are a little more snowflakey :snowflake:, here's a more complex example:

```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    nodes:
      - name: "teeny-tiny-node"
        deviceFilter: "." #(1)!
      - name: "bigass-node"
        devices:
          - name: "/dev/disk/by-path/pci-0000:01:00.0-sas-exp0x500404201f43b83f-phy11-lun-0" #(2)!
            config:
              metadataDevice: "/dev/osd-metadata/11"
          - name: "nvme0n1" #(3)!
          - name: "nvme1n1"
```

1. Match any device found on this node
2. Match a very specific device path, and pair this device with a faster device for OSD metadata
3. Match specific devices by their kernel names
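!!! warning
    Rook will only consume devices which are entirely empty (*no partition table, no filesystem signatures*). If a disk you intend to give to Ceph has seen prior use, wipe it first - a destructive sketch, based on Rook's cleanup instructions (*triple-check the device path before running this!*):

    ```bash
    DISK="/dev/sdX" # assumption: substitute the actual disk you're dedicating to Ceph

    # destroy GPT/MBR partition table structures
    sgdisk --zap-all $DISK

    # zero the start of the disk to clear any lingering filesystem signatures
    dd if=/dev/zero of=$DISK bs=1M count=100 oflag=direct,dsync
    ```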
### HelmRelease

Finally, having set the scene above, we define the HelmRelease which will actually deploy the Ceph cluster. I save this in my flux repo:

```yaml title="/rook-ceph-cluster/helmrelease-rook-ceph-cluster.yaml"
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
spec:
  chart:
    spec:
      chart: rook-ceph-cluster
      version: 1.9.x
      sourceRef:
        kind: HelmRepository
        name: rook-release
        namespace: flux-system
  interval: 30m
  timeout: 10m
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: -1 # keep trying to remediate
    crds: CreateReplace # Upgrade CRDs on package update
  releaseName: rook-ceph-cluster
  valuesFrom:
    - kind: ConfigMap
      name: rook-ceph-cluster-helm-chart-value-overrides
      valuesKey: values.yaml # (1)!
```

1. This is the default, but best to be explicit for clarity

## Install Rook Ceph Cluster!

Commit the changes to your flux repository, and either wait for the reconciliation interval, or force a reconciliation using `flux reconcile source git flux-system`. You should see the kustomization appear...

```bash
~ ❯ flux get kustomizations rook-ceph-cluster
NAME               READY  MESSAGE                         REVISION      SUSPENDED
rook-ceph-cluster  True   Applied revision: main/345ee5e  main/345ee5e  False
~ ❯
```

The helmrelease should be reconciled...

```bash
~ ❯ flux get helmreleases -n rook-ceph rook-ceph-cluster
NAME               READY  MESSAGE                           REVISION  SUSPENDED
rook-ceph-cluster  True   Release reconciliation succeeded  v1.9.9    False
~ ❯
```

And you should have a happy rook-ceph operator pod:

```bash
~ ❯ k get pods -n rook-ceph -l app=rook-ceph-operator
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-operator-7c94b7446d-nwsss   1/1     Running   0          5m14s
~ ❯
```

To watch the operator do its magic, you can tail its logs, using:

```bash
k logs -n rook-ceph -f -l app=rook-ceph-operator
```

You can **get** or **describe** the status of your cephcluster:

```bash
~ ❯ k get cephclusters.ceph.rook.io -n rook-ceph
NAME        DATADIRHOSTPATH   MONCOUNT   AGE     PHASE   MESSAGE                        HEALTH      EXTERNAL
rook-ceph   /var/lib/rook     3          6d22h   Ready   Cluster created successfully   HEALTH_OK
~ ❯
```

### How do I know it's working?

So we have a Ceph cluster now, but how do we know we can actually provision volumes?

#### Create PVCs

Create two ceph-block PVCs (*persistent volume claims*), by applying the YAML sketched below.
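Here's a minimal example (*the claim names are illustrative, and `ceph-block` is the storageClass created by the chart's default `cephBlockPools` values*):

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-1 # illustrative name only
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-block
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-2 # illustrative name only
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-block
EOF
```

Confirm that both claims transition from `Pending` to `Bound`:

```bash
k get pvc test-pvc-1 test-pvc-2
```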