mirror of https://github.com/funkypenguin/geek-cookbook/ synced 2025-12-13 17:56:26 +00:00

Updated doc structure (#9)

This commit is contained in:
David Young
2017-08-04 22:34:41 +12:00
committed by GitHub
parent 05a146f11c
commit e9d0bb822e
89 changed files with 25 additions and 10363 deletions


@@ -1,11 +0,0 @@
# How to read this book
## Structure
1. "Recipies" generally follow on from each other. I.e., if a particular recipe requires a mail server, that mail server would have been described in an earlier recipe.
2. Each recipe contains enough detail in a single page to take a project from start to completion.
3. When optional add-ons/integrations are possible for a project (e.g., the addition of "smart LED bulbs" to Home Assistant), this will be reflected either as a brief "Chef's note" after the recipe or, if they're substantial enough, as a sub-page of the main project.
## Conventions
1. When creating swarm networks, we always explicitly set the subnet in the overlay network, to avoid potential conflicts (which Docker won't prevent, but which will generate errors) (https://github.com/moby/moby/issues/26912) - see the example below.
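For illustration only (the network name and subnet here are arbitrary), a stack's network definition following this convention might look like:
```
networks:
  internal:
    driver: overlay
    ipam:
      config:
        - subnet: 172.16.10.0/24
```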


@@ -1,113 +0,0 @@
# Introduction
[Tiny Tiny RSS][ttrss] is a self-hosted, AJAX-based RSS reader, which rose to popularity as a replacement for Google Reader. It supports advanced features, such as:
* Plugins and theming in a drop-in fashion
* Filtering (discard all articles with title matching "trump")
* Sharing articles via a unique public URL/feed
Tiny Tiny RSS requires a database and a webserver - this recipe provides both using docker, exposed to the world via LetsEncrypt.
# Ingredients
**Required**
1. Webserver (nginx container)
2. Database (postgresql container)
3. TTRSS (ttrss container)
4. Nginx reverse proxy with LetsEncrypt
**Optional**
1. Email server (if you want to email articles from TTRSS)
# Preparation
**Setup filesystem location**
I set up a directory for the ttrss data, at /data/ttrss.
I created docker-compose.yml, as follows:
````
rproxy:
  image: nginx:1.13-alpine
  ports:
    - "34804:80"
  environment:
    - DOMAIN_NAME=ttrss.funkypenguin.co.nz
    - VIRTUAL_HOST=ttrss.funkypenguin.co.nz
    - LETSENCRYPT_HOST=ttrss.funkypenguin.co.nz
    - LETSENCRYPT_EMAIL=davidy@funkypenguin.co.nz
  volumes:
    - ./nginx.conf:/etc/nginx/nginx.conf:ro
  volumes_from:
    - ttrss
  links:
    - ttrss:ttrss

ttrss:
  image: tkaefer/docker-ttrss
  restart: always
  links:
    - postgres:database
  environment:
    - DB_USER=ttrss
    - DB_PASS=uVL53xfmJxW
    - SELF_URL_PATH=https://ttrss.funkypenguin.co.nz
  volumes:
    - ./plugins.local:/var/www/plugins.local
    - ./themes.local:/var/www/themes.local
    - ./reader:/var/www/reader

postgres:
  image: postgres:latest
  volumes:
    - /srv/ssd-data/ttrss/db:/var/lib/postgresql/data
  restart: always
  environment:
    - POSTGRES_USER=ttrss
    - POSTGRES_PASSWORD=uVL53xfmJxW

gmailsmtp:
  image: softinnov/gmailsmtp
  restart: always
  environment:
    - user=davidy@funkypenguin.co.nz
    - pass=eqknehqflfbufzbh
    - DOMAIN_NAME=gmailsmtp.funkypenguin.co.nz
````
Run ````docker-compose up```` in the same directory, and watch the output. The PostgreSQL container will create the "ttrss" database, and ttrss will start using it.
# Login to UI
Log into https://\<your VIRTUALHOST\>. Default user is "admin" and password is "password"
# Optional - Enable af_psql_trgm plugin for similar post detection
One of the native plugins enables the detection of "similar" articles. This requires the pg_trgm extension enabled in your database.
From the working directory, use ````docker exec```` to get a shell within your postgres container, switch to the postgres user, and run ````psql````:
````
[root@kvm nginx]# docker exec -it ttrss_postgres_1 /bin/sh
# su - postgres
No directory, logging in with HOME=/
$ psql
psql (9.6.3)
Type "help" for help.
````
Add the trgm extension to your ttrss database:
````
postgres=# \c ttrss
You are now connected to database "ttrss" as user "postgres".
ttrss=# CREATE EXTENSION pg_trgm;
CREATE EXTENSION
ttrss=# \q
````
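If you'd rather not shell in interactively, the same result can be had from the docker host with one-liners like these (assuming the container name shown above, and that the "ttrss" user was created as a superuser by the official postgres image):
```
# Create the extension if it isn't already present (idempotent)
docker exec ttrss_postgres_1 psql -U ttrss -d ttrss -c "CREATE EXTENSION IF NOT EXISTS pg_trgm;"

# Confirm it's installed
docker exec ttrss_postgres_1 psql -U ttrss -d ttrss -c "\dx pg_trgm"
```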
[ttrss]:https://tt-rss.org/


@@ -1,15 +0,0 @@
<!-- Piwik -->
<script type="text/javascript">
var _paq = _paq || [];
/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
var u="//piwik.funkypenguin.co.nz/";
_paq.push(['setTrackerUrl', u+'piwik.php']);
_paq.push(['setSiteId', '2']);
var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s);
})();
</script>
<!-- End Piwik Code -->


@@ -1,88 +0,0 @@
# Design
In the design described below, the "private cloud" platform is:
* **Highly-available** (_can tolerate the failure of a single component_)
* **Scalable** (_can add resource or capacity as required_)
* **Portable** (_run it on your garage server today, run it in AWS tomorrow_)
* **Secure** (_access protected with LetsEncrypt certificates_)
* **Automated** (_requires minimal care and feeding_)
## Design Decisions
**Where possible, services will be highly available.**
This means that:
* At least 3 docker swarm manager nodes are required, to provide fault-tolerance of a single failure.
* GlusterFS is employed for the shared filesystem, because it too can be made tolerant of a single failure.
**Where multiple solutions to a requirement exist, preference will be given to the most portable solution.**
This means that:
* Services are defined using docker-compose v3 YAML syntax
* Services are portable, meaning a particular stack could be shut down and moved to a new provider with minimal effort.
## Security
Under this design, the only inbound connections we're permitting to our docker swarm are:
### Network Flows
* HTTP (TCP 80) : Redirects to https
* HTTPS (TCP 443) : Serves individual docker containers via SSL-encrypted reverse proxy
### Authentication
* Where the proxied application provides a trusted level of authentication, or where the application requires public exposure, it is exposed directly via the reverse proxy. Where it does not, an oauth2 proxy is placed in front of it (covered in later recipes).
## High availability
### Normal function
Assuming 3 nodes, under normal circumstances the following is illustrated:
* All 3 nodes provide shared storage via GlusterFS, which is provided by a docker container on each node. (i.e., not running in swarm mode)
* All 3 nodes participate in the Docker Swarm as managers.
* The various containers belonging to the application "stacks" deployed within Docker Swarm are automatically distributed amongst the swarm nodes.
* Persistent storage for the containers is provided via a GlusterFS mount.
* The **traefik** service (in swarm mode) receives incoming requests (on http and https), and forwards them to individual containers. Traefik knows the containers' names because it's able to access the docker socket.
* All 3 nodes run keepalived, at different priorities. Since traefik is running as a swarm service and listening on TCP 80/443, requests made to the keepalived VIP and arriving at **any** of the swarm nodes will be forwarded to the traefik container (no matter which node it's on), and then onto the target backend.
![HA function](images/docker-swarm-ha-function.png)
### Node failure
In the case of a failure (or scheduled maintenance) of one of the nodes, the following is illustrated:
* The failed node no longer participates in GlusterFS, but the remaining nodes provide enough fault-tolerance for the cluster to operate.
* The remaining two nodes in Docker Swarm achieve a quorum and agree that the failed node is to be removed.
* The (possibly new) leader manager node reschedules the containers known to be running on the failed node, onto other nodes.
* The **traefik** service is either restarted or unaffected, and as the backend containers stop/start and change IP, traefik is aware and updates accordingly.
* The keepalived VIP continues to function on the remaining nodes, and docker swarm continues to forward any traffic received on TCP 80/443 to the appropriate node.
![HA function](images/docker-swarm-node-failure.png)
### Node restore
When the failed (or upgraded) host is restored to service, the following is illustrated:
* GlusterFS regains full redundancy
* Docker Swarm managers become aware of the recovered node, and will use it for scheduling **new** containers
* Existing containers which were migrated off the node are not migrated back
* Keepalived VIP regains full redundancy
![HA function](images/docker-swarm-node-restore.png)
### Total cluster failure
A day after writing this, my environment suffered a fault whereby all 3 VMs were unexpectedly and simultaneously powered off.
Upon restore, docker failed to start on one of the VMs due to a local disk space issue[^1]. However, the other two VMs started, established the swarm, mounted their shared storage, and started up all the containers (services) which were managed by the swarm.
In summary, although I suffered an **unplanned power outage to all of my infrastructure**, followed by a **failure of a third of my hosts**... ==all my platforms are 100% available with **absolutely no manual intervention**==.
[^1]: Since there's no impact to availability, I can fix (or just reinstall) the failed node whenever convenient.


@@ -1,252 +0,0 @@
# Docker Swarm Mode
For truly highly-available services with Docker containers, we need an orchestration system. Docker Swarm mode (as of Docker 1.13) is the simplest way to achieve redundancy, such that a single docker host could be turned off, and none of our services will be interrupted.
## Ingredients
* 3 x CentOS Atomic hosts (bare-metal or VMs). A reasonable minimum would be:
* 1 x vCPU
* 1GB RAM
* 10GB HDD
* Hosts must be within the same subnet, and connected on a low-latency link (i.e., no WAN links)
## Preparation
### Release the swarm!
Now, to launch my swarm:
```docker swarm init```
Yeah, that was it. Now I have a 1-node swarm.
```
[root@ds1 ~]# docker swarm init
Swarm initialized: current node (b54vls3wf8xztwfz79nlkivt8) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-2orjbzjzjvm1bbo736xxmxzwaf4rffxwi0tu3zopal4xk4mja0-bsud7xnvhv4cicwi7l6c9s6l0 \
202.170.164.47:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
[root@ds1 ~]#
```
Run ```docker node ls``` to confirm that I have a 1-node swarm:
```
[root@ds1 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
b54vls3wf8xztwfz79nlkivt8 * ds1.funkypenguin.co.nz Ready Active Leader
[root@ds1 ~]#
```
Note that when I ran ```docker swarm init``` above, the CLI output gave me a command to run to join further nodes to my swarm. This would join the nodes as __workers__ (as opposed to __managers__). Workers can easily be promoted to managers (and demoted again), but since we know that we want our other two nodes to be managers too, it's simpler just to add them to the swarm as managers immediately.
On the first swarm node, generate the necessary token to join another manager by running ```docker swarm join-token manager```:
```
[root@ds1 ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-2orjbzjzjvm1bbo736xxmxzwaf4rffxwi0tu3zopal4xk4mja0-cfm24bq2zvfkcwujwlp5zqxta \
202.170.164.47:2377
[root@ds1 ~]#
```
Run the command provided on your second node to join it to the swarm as a manager. After adding the second node, the output of ```docker node ls``` (on either host) should reflect two nodes:
````
[root@ds2 davidy]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
b54vls3wf8xztwfz79nlkivt8 ds1.funkypenguin.co.nz Ready Active Leader
xmw49jt5a1j87a6ihul76gbgy * ds2.funkypenguin.co.nz Ready Active Reachable
[root@ds2 davidy]#
````
Repeat the process to add your third node. **You need a new token for the third node, don't re-use the manager token you generated for the second node**.
!!! warning "Seriously. Don't use a token more than once, else it's swarm-rebuilding time."
Finally, ```docker node ls``` should reflect that you have 3 reachable manager nodes, one of whom is the "Leader":
```
[root@ds3 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
36b4twca7i3hkb7qr77i0pr9i ds1.openstack.dev.safenz.net Ready Active Reachable
l14rfzazbmibh1p9wcoivkv1s * ds3.openstack.dev.safenz.net Ready Active Reachable
tfsgxmu7q23nuo51wwa4ycpsj ds2.openstack.dev.safenz.net Ready Active Leader
[root@ds3 ~]#
```
### Create registry mirror
Although we now have shared storage for our persistent container data, our docker nodes don't share any other docker data, such as container images. This results in an inefficiency - every node which participates in the swarm will, at some point, need the docker image for every container deployed in the swarm.
When dealing with large containers (looking at you, GitLab!), this can result in several gigabytes of wasted bandwidth per node, and long delays when restarting containers on an alternate node. (_It also wastes disk space on each node, but we'll get to that in the next section_)
The solution is to run an official Docker registry container as a ["pull-through" cache, or "registry mirror"](https://docs.docker.com/registry/recipes/mirror/). By using our persistent storage for the registry cache, we can ensure we have a single copy of all the containers we've pulled at least once. After the first pull, any subsequent pulls from our nodes will use the cached version from our registry mirror. As a result, services are available more quickly when restarting container nodes, and we can be more aggressive about cleaning up unused containers on our nodes (more later)
The registry mirror runs as a swarm stack, using a simple docker-compose.yml. Customize __your mirror FQDN__ below, so that Traefik will generate the appropriate LetsEncrypt certificates for it, and make it available via HTTPS.
```
version: "3"
services:
registry-mirror:
image: registry:2
networks:
- traefik
deploy:
labels:
- traefik.frontend.rule=Host:<your mirror FQDN>
- traefik.docker.network=traefik
- traefik.port=5000
ports:
- 5000:5000
volumes:
- /var/data/registry/registry-mirror-data:/var/lib/registry
- /var/data/registry/registry-mirror-config.yml:/etc/docker/registry/config.yml
networks:
traefik:
external: true
```
!!! note "Unencrypted registry"
We create this registry without consideration for SSL, which will fail if we attempt to use the registry directly. However, we're going to use the HTTPS-proxied version via Traefik, leveraging Traefik to manage the LetsEncrypt certificates required.
Create registry/registry-mirror-config.yml as follows:
```
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
  delete:
    enabled: true
http:
  addr: :5000
  headers:
    X-Content-Type-Options: [nosniff]
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
proxy:
  remoteurl: https://registry-1.docker.io
```
### Enable registry mirror and experimental features
To tell docker to use the registry mirror, and in order to be able to watch the logs of any service from any manager node (_an experimental feature in the current Atomic docker build_), edit **/etc/docker-latest/daemon.json** on each node, and change from:
```
{
  "log-driver": "journald",
  "signature-verification": false
}
```
To:
```
{
  "log-driver": "journald",
  "signature-verification": false,
  "experimental": true,
  "registry-mirrors": ["https://<your registry mirror FQDN>"]
}
```
!!! tip ""
Note the extra comma required after "false" above
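The daemon only re-reads daemon.json on restart. Run something like the following on each node (one at a time) to apply and verify the change; the /v2/_catalog endpoint is part of the standard registry API:
```
# Restart the docker-latest daemon to pick up the new daemon.json
systemctl restart docker-latest

# The daemon should now list the mirror
docker info | grep -A1 'Registry Mirrors'

# The mirror should answer via Traefik's HTTPS proxy
curl -s https://<your registry mirror FQDN>/v2/_catalog
```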
### Setup automated cleanup
This should eventually become a docker-compose.yml swarm stack, excluding trusted images (like glusterfs, traefik, etc.) - see the sketch below. In the meantime, run the cleanup container directly on each node:
```
docker run -d \
-v /var/run/docker.sock:/var/run/docker.sock:rw \
-v /var/lib/docker:/var/lib/docker:rw \
meltwater/docker-cleanup:latest
```
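Until that conversion happens, here's a hedged sketch of what such a swarm stack might look like. The KEEP_IMAGES variable is an assumption about the meltwater/docker-cleanup image - verify it against the image's documentation before relying on it:
```
version: "3"

services:
  docker-cleanup:
    image: meltwater/docker-cleanup:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:rw
      - /var/lib/docker:/var/lib/docker:rw
    environment:
      # Assumed exclusion variable - check the image docs
      - KEEP_IMAGES=gluster,traefik
    deploy:
      # Run one cleanup container on every node
      mode: global
```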
### Tweaks
Add some handy bash auto-completion for docker. Without this, you'll get annoyed that you can't autocomplete ```docker stack deploy <blah> -c <blah.yml>``` commands.
```
cd /etc/bash_completion.d/
curl -O https://raw.githubusercontent.com/docker/cli/b75596e1e4d5295ac69b9934d1bd8aff691a0de8/contrib/completion/bash/docker
```
Install some useful bash aliases on each host
```
cd ~
curl -O https://gitlab.funkypenguin.co.nz/funkypenguin/geeks-cookbook-recipies/raw/master/bash/gcb-aliases.sh
echo 'source ~/gcb-aliases.sh' >> ~/.bash_profile
```
Build and activate the SELinux policy which permits containers to access the docker socket (Traefik will need this later):
````
mkdir ~/dockersock
cd ~/dockersock
curl -O https://raw.githubusercontent.com/dpw/selinux-dockersock/master/Makefile
curl -O https://raw.githubusercontent.com/dpw/selinux-dockersock/master/dockersock.te
make && semodule -i dockersock.pp
````
## Setup registry
To run a plain (non-mirror) local registry instead of the pull-through mirror above:
```
docker run -d \
  -p 5000:5000 \
  --restart=always \
  --name registry \
  -v /mnt/registry:/var/lib/registry \
  registry:2
```

Binary file not shown. (Before: 314 KiB)

Binary file not shown. (Before: 333 KiB)

Binary file not shown. (Before: 310 KiB)

Binary file not shown. (Before: 43 KiB)


@@ -1,70 +0,0 @@
# Keepalived
While having a self-healing, scalable docker swarm is great for availability and scalability, none of that is any good if nobody can connect to your cluster.
In order to provide seamless external access to clustered resources, regardless of which node they're on and tolerant of node failure, you need to present a single IP to the world for external access.
Normally this is done using an HA loadbalancer, but since Docker Swarm already provides load-balancing capabilities (the routing mesh), all we need for seamless HA is a virtual IP which can be provided by more than one docker node.
This is accomplished with the use of keepalived on at least two nodes.
## Ingredients
```
Already deployed:
[X] At least 2 x CentOS/Fedora Atomic VMs
[X] low-latency link (i.e., no WAN links)
New:
[ ] 3 x IPv4 addresses (one for each node and one for the virtual IP)
```
## Preparation
### Enable IPVS module
On all nodes which will participate in keepalived, we need the "ip_vs" kernel module, in order to permit services to bind to non-local interface addresses.
Set this up once for both the primary and secondary nodes, by running:
```
echo "modprobe ip_vs" >> /etc/rc.local
modprobe ip_vs
```
### Setup nodes
Assuming your IPs are as follows:
* 192.168.4.1 : Primary
* 192.168.4.2 : Secondary
* 192.168.4.3 : Virtual
Run the following on the primary
```
docker run -d --name keepalived --restart=always \
--cap-add=NET_ADMIN --net=host \
-e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['192.168.4.1', '192.168.4.2']" \
-e KEEPALIVED_VIRTUAL_IPS=192.168.4.3 \
-e KEEPALIVED_PRIORITY=200 \
osixia/keepalived:1.3.5
```
And on the secondary:
```
docker run -d --name keepalived --restart=always \
--cap-add=NET_ADMIN --net=host \
-e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['192.168.4.1', '192.168.4.2']" \
-e KEEPALIVED_VIRTUAL_IPS=192.168.4.3 \
-e KEEPALIVED_PRIORITY=100 \
osixia/keepalived:1.3.5
```
## Serving
That's it. Each node will talk to the other via unicast (no need to un-firewall multicast addresses), and the node with the highest priority gets to be the master. When ingress traffic arrives on the master node via the VIP, docker's routing mesh will deliver it to the appropriate docker node.
## Chef's notes
1. Some hosting platforms (OpenStack, for one) won't allow you to simply "claim" a virtual IP. Each node is only able to receive traffic targetted to its unique IP. In this case, keepalived is not the right solution, and a platform-specific load-balancing solution should be used. In OpenStack, this is Neutron's "Load Balancer As A Service" (LBAAS) component. AWS and Azure would likely include similar protections.
2. More than 2 nodes can participate in keepalived. Simply ensure that each node has the appropriate priority set, and the node with the highest priority will become the master (see the sketch below).
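For example, a third node with an assumed address of 192.168.4.4 might be added like this (the existing nodes' KEEPALIVED_UNICAST_PEERS would also need updating to include the new peer):
```
docker run -d --name keepalived --restart=always \
  --cap-add=NET_ADMIN --net=host \
  -e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['192.168.4.1', '192.168.4.2', '192.168.4.4']" \
  -e KEEPALIVED_VIRTUAL_IPS=192.168.4.3 \
  -e KEEPALIVED_PRIORITY=150 \
  osixia/keepalived:1.3.5
```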


@@ -1,83 +0,0 @@
# Introduction
## Adding a host
## Adding storage
```
gluster volume add-brick VOLNAME NEW_BRICK
```
Example:
```
# gluster volume add-brick test-volume server4:/exp4
Add Brick successful
```
# Replacing failed host
Followed https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Replacing_Hosts.html
```
[root@glusterfs-server /]# gluster peer status
Number of Peers: 1

Hostname: ds1
Uuid: db9c80da-11e4-461d-8ea5-66dd12ca897c
State: Peer in Cluster (Disconnected)
[root@glusterfs-server /]#
```
Grab the UUID above, then edit /var/lib/glusterd/glusterd.info and change:
```
UUID=aee45c2c-aa19-4d29-bc94-4833f2b22863
```
to:
```
UUID=db9c80da-11e4-461d-8ea5-66dd12ca897c
```
My peer's id (ds2):
```
[root@glusterfs-server /]# gluster system:: uuid get
UUID: 38ca4e8b-8ef5-4165-9f41-5c8b3f0103cc
[root@glusterfs-server /]#
```
vi /var/lib/glusterd/peers/38ca4e8b-8ef5-4165-9f41-5c8b3f0103cc:
```
UUID=38ca4e8b-8ef5-4165-9f41-5c8b3f0103cc
state=3
hostname=ds3
```
Got volume info:
```
[root@glusterfs-server /]# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 84e1169c-41dc-467a-9ae1-a474efaf789f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ds1:/var/no-direct-write-here/brick1/gv0
Brick2: ds3:/var/no-direct-write-here/brick1/gv0
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
[root@glusterfs-server /]#
```
Then set the volume-id extended attribute on the replacement brick to match the original:
```
[root@glusterfs-server /]# getfattr -d -m. -ehex /var/no-direct-write-here/brick1/gv0/
getfattr: Removing leading '/' from absolute path names
# file: var/no-direct-write-here/brick1/gv0/
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x84e1169c41dc467a9ae1a474efaf789f
[root@glusterfs-server /]#

setfattr -n trusted.glusterfs.volume-id -v 0x84e1169c41dc467a9ae1a474efaf789f /var/no-direct-write-here/brick1/gv0
```


@@ -1,186 +0,0 @@
# Shared Storage (Ceph)
While Docker Swarm is great for keeping containers running (_and restarting those that fail_), it does nothing for persistent storage. This means if you actually want your containers to keep any data persistent across restarts (_hint: you do!_), you need to provide shared storage to every docker node.
## Design
### Why not GlusterFS?
I originally provided shared storage to my nodes using GlusterFS (see the next recipe for details), but found it difficult to deal with because:
1. GlusterFS requires (n) "bricks", where (n) **has** to be a multiple of your replica count. I.e., if you want 2 copies of everything on shared storage (the minimum to provide redundancy), you **must** have either 2, 4, 6 (etc.) bricks. The HA swarm design calls for a minimum of 3 nodes, and so under GlusterFS, my third node can't participate in shared storage at all, unless I start doubling up on bricks-per-node (which then impacts redundancy)
2. GlusterFS turns out to be a giant PITA when you want to restore a failed node. There are at [least 14 steps to follow](https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Replacing_Hosts.html) to replace a brick.
3. I'm pretty sure I messed up the 14-step process above anyway. My replaced brick synced with my "original" brick, but produced errors when querying status via the CLI, and hogged 100% of 1 CPU on the replaced node. Inexperienced with GlusterFS, and unable to diagnose the fault, I switched to a Ceph cluster instead.
### Why Ceph?
1. I'm more familiar with Ceph - I use it in the OpenStack designs I manage
2. Replacing a failed node is **easy**, provided you can put up with the I/O load of rebalancing OSDs after the replacement.
3. CentOS Atomic includes the ceph client in the OS, so while the Ceph OSD/Mon/MDS daemons are running in containers, I can keep an eye on (and later, automatically monitor) the status of Ceph from the base OS.
## Ingredients
!!! summary "Ingredients"
    3 x Virtual Machines (configured earlier), each with:

    * [X] CentOS/Fedora Atomic
    * [X] At least 1GB RAM
    * [X] At least 20GB disk space (_but it'll be tight_)
    * [X] Connectivity to each other within the same subnet, and on a low-latency link (_i.e., no WAN links_)
    * [ ] A second disk dedicated to the Ceph OSD
## Preparation
### SELinux
Since our Ceph components will be containerized, we need to ensure the SELinux context on the base OS's ceph files is set correctly:
```
chcon -Rt svirt_sandbox_file_t /etc/ceph
chcon -Rt svirt_sandbox_file_t /var/lib/ceph
```
### Setup Monitors
Pick a node, and run the following to stand up the first Ceph mon. Be sure to replace the values for **MON_IP** and **CEPH_PUBLIC_NETWORK** to those specific to your deployment:
```
docker run -d --net=host \
--restart always \
-v /etc/ceph:/etc/ceph \
-v /var/lib/ceph/:/var/lib/ceph/ \
-e MON_IP=192.168.31.11 \
-e CEPH_PUBLIC_NETWORK=192.168.31.0/24 \
--name="ceph-mon" \
ceph/daemon mon
```
Now **copy** the contents of /etc/ceph on this first node to the remaining nodes, and **then** run the docker command above (_customizing MON_IP as you go_) on each remaining node. You'll end up with a cluster with 3 monitors (odd number is required for quorum, same as Docker Swarm), and no OSDs (yet)
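One way to do the copy, assuming the ds2/ds3 hostnames used elsewhere in this book and root SSH access between the nodes, might be:
```
# Run from the first mon node; repeat for each remaining node
scp -r /etc/ceph/* root@ds2:/etc/ceph/
scp -r /etc/ceph/* root@ds3:/etc/ceph/
```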
### Setup OSDs
Since we currently have an OSD-less, mon-only cluster, prepare for OSD creation by dumping the auth credentials for the OSDs into the appropriate location on the base OS:
```
ceph auth get client.bootstrap-osd -o \
/var/lib/ceph/bootstrap-osd/ceph.keyring
```
On each node, you need a dedicated disk for the OSD. In the example below, I used _/dev/vdd_ (the entire disk, no partitions) for the OSD.
Run the following command on every node:
```
docker run -d --net=host \
--privileged=true \
--pid=host \
-v /etc/ceph:/etc/ceph \
-v /var/lib/ceph/:/var/lib/ceph/ \
-v /dev/:/dev/ \
-e OSD_DEVICE=/dev/vdd \
-e OSD_TYPE=disk \
--name="ceph-osd" \
--restart=always \
ceph/daemon osd
```
Watch the output by running ```docker logs ceph-osd -f```, and confirm success.
!!! note "Zapping the device"
The Ceph OSD container will refuse to destroy a partition containing existing data, so it may be necessary to "zap" the target disk, using:
```
docker run -d --privileged=true \
-v /dev/:/dev/ \
-e OSD_DEVICE=/dev/sdd \
ceph/daemon zap_device
```
### Setup MDSs
In order to mount our ceph pools as filesystems, we'll need Ceph MDS(s). Run the following on each node:
```
docker run -d --net=host \
--name ceph-mds \
--restart always \
-v /var/lib/ceph/:/var/lib/ceph/ \
-v /etc/ceph:/etc/ceph \
-e CEPHFS_CREATE=1 \
-e CEPHFS_DATA_POOL_PG=256 \
-e CEPHFS_METADATA_POOL_PG=256 \
ceph/daemon mds
```
### Apply tweaks
The ceph container seems to configure a pool default of 3 replicas (3 copies of each block are retained), which is one too many for our cluster (we are only protecting against the failure of a single node).
Run the following on any node to reduce the size of the pool to 2 replicas:
```
ceph osd pool set cephfs_data size 2
ceph osd pool set cephfs_metadata size 2
```
Disabled "scrubbing" (which can be IO-intensive, and is unnecessary on a VM) with:
```
ceph osd set noscrub
ceph osd set nodeep-scrub
```
### Create credentials for swarm
In order to mount the ceph volume onto our base host, we need to provide cephx authentication credentials.
On **one** node, create a client for the docker swarm:
```
ceph auth get-or-create client.dockerswarm osd \
'allow rw' mon 'allow r' mds 'allow' > /etc/ceph/keyring.dockerswarm
```
Grab the secret associated with the new user (you'll need this for the /etc/fstab entry below) by running:
```
ceph-authtool /etc/ceph/keyring.dockerswarm -p -n client.dockerswarm
```
### Mount MDS volume
On each node, create a mountpoint for the data by running ```mkdir /var/data```, add an entry to fstab to ensure the volume is auto-mounted on boot, and ensure the volume is actually _mounted_ if there's a network / boot delay getting access to the ceph volume:
```
mkdir /var/data
MYHOST=`hostname -s`
echo -e "
# Mount cephfs volume \n
$MYHOST:6789:/ /var/data/ ceph \
name=dockerswarm\
,secret=<YOUR SECRET HERE>\
,noatime,_netdev,context=system_u:object_r:svirt_sandbox_file_t:s0\
0 2" >> /etc/fstab
mount -a
```
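At this point it's worth a quick sanity check of both the cluster and the mount. From any node, something like the following should report HEALTH_OK (or HEALTH_WARN while rebalancing) and show /var/data mounted as a ceph filesystem:
```
# Overall cluster health, mon quorum and OSD status
ceph -s

# Confirm /var/data is mounted
df -h /var/data
mount | grep /var/data
```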
### Install docker-volume plugin
Installation of a docker-volume plugin for Ceph is currently blocked by an upstream docker-latest bug (reported at https://bugs.centos.org/view.php?id=13609) and a related Alpine issue (https://github.com/gliderlabs/docker-alpine/issues/317).
## Serving
After completing the above, you should have:
```
[X] Persistent storage available to every node
[X] Resiliency in the event of the failure of a single node
```
## Chef's Notes
Future enhancements to this recipe include:
1. Rather than pasting a secret key into /etc/fstab (which feels wrong), I'd prefer to be able to set "secretfile" in /etc/fstab (which just points ceph.mount to a file containing the secret), but under the current CentOS Atomic, we're stuck with "secret", per https://bugzilla.redhat.com/show_bug.cgi?id=1030402


@@ -1,164 +0,0 @@
# Shared Storage (GlusterFS)
While Docker Swarm is great for keeping containers running (_and restarting those that fail_), it does nothing for persistent storage. This means if you actually want your containers to keep any data persistent across restarts (_hint: you do!_), you need to provide shared storage to every docker node.
## Design
### Why GlusterFS?
This GlusterFS recipe was my original design for shared storage, but I [found it to be flawed](/ha-docker-swarm/shared-storage-ceph/#why-not-glusterfs), and I replaced it with a [design which employs Ceph instead](/ha-docker-swarm/shared-storage-ceph/#why-ceph). This recipe is an alternative to the Ceph design, if you happen to prefer GlusterFS.
## Ingredients
!!! summary "Ingredients"
    3 x Virtual Machines (configured earlier), each with:

    * [X] CentOS/Fedora Atomic
    * [X] At least 1GB RAM
    * [X] At least 20GB disk space (_but it'll be tight_)
    * [X] Connectivity to each other within the same subnet, and on a low-latency link (_i.e., no WAN links_)
    * [ ] A second disk, or adequate space on the primary disk for a dedicated data partition
## Preparation
### Create Gluster "bricks"
To build our Gluster volume, we need 2 out of the 3 VMs to provide one "brick". The bricks will be used to create the replicated volume. Assuming a replica count of 2 (_i.e., 2 copies of the data are kept in gluster_), our total number of bricks must be divisible by our replica count. (_I.e., you can't have 3 bricks if you want 2 replicas. You can have 4 though - We have to have minimum 3 swarm manager nodes for fault-tolerance, but only 2 of those nodes need to run as gluster servers._)
On each host, run a variation following to create your bricks, adjusted for the path to your disk.
!!! note "The example below assumes /dev/vdb is dedicated to the gluster volume"
```
(
echo o # Create a new empty DOS partition table
echo n # Add a new partition
echo p # Primary partition
echo 1 # Partition number
echo # First sector (Accept default: 1)
echo # Last sector (Accept default: varies)
echo w # Write changes
) | sudo fdisk /dev/vdb
mkfs.xfs -i size=512 /dev/vdb1
mkdir -p /var/no-direct-write-here/brick1
echo '' >> /etc/fstab
echo '# Mount /dev/vdb1 so that it can be used as a glusterfs volume' >> /etc/fstab
echo '/dev/vdb1 /var/no-direct-write-here/brick1 xfs defaults 1 2' >> /etc/fstab
mount -a && mount
```
!!! warning "Don't provision all your LVM space"
Atomic uses LVM to store docker data, and **automatically grows** Docker's volumes as required. If you commit all your free LVM space to your brick, you'll quickly find (as I did) that docker will start to fail with error messages about insufficient space. If you're going to slice off a portion of your LVM space in /dev/atomicos, make sure you leave enough space for Docker storage, where "enough" depends on how much you plan to pull images, make volumes, etc. I ate through 20GB very quickly doing development, so I ended up provisioning 50GB for atomic alone, with a separate volume for the brick.
### Create glusterfs container
Atomic doesn't include the Gluster server components. This means we'll have to run glusterd from within a container, with privileged access to the host. Although convoluted, I've come to prefer this design since it once again makes the OS "disposable", moving all the config into containers and code.
Run the following on each host:
````
docker run \
-h glusterfs-server \
-v /etc/glusterfs:/etc/glusterfs:z \
-v /var/lib/glusterd:/var/lib/glusterd:z \
-v /var/log/glusterfs:/var/log/glusterfs:z \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /var/no-direct-write-here/brick1:/var/no-direct-write-here/brick1 \
-d --privileged=true --net=host \
--restart=always \
--name="glusterfs-server" \
gluster/gluster-centos
````
### Create trusted pool
On a single node (doesn't matter which), run ```docker exec -it glusterfs-server bash``` to launch a shell inside the container.
From the node, run
```gluster peer probe <other host>```
Example output:
```
[root@glusterfs-server /]# gluster peer probe ds1
peer probe: success.
[root@glusterfs-server /]#
```
Run ```gluster peer status``` on both nodes to confirm that they're properly connected to each other:
Example output:
```
[root@glusterfs-server /]# gluster peer status
Number of Peers: 1
Hostname: ds3
Uuid: 3e115ba9-6a4f-48dd-87d7-e843170ff499
State: Peer in Cluster (Connected)
[root@glusterfs-server /]#
```
### Create gluster volume
Now we create a *replicated volume* out of our individual "bricks".
Create the gluster volume by running
```
gluster volume create gv0 replica 2 \
server1:/var/no-direct-write-here/brick1 \
server2:/var/no-direct-write-here/brick1
```
Example output:
```
[root@glusterfs-server /]# gluster volume create gv0 replica 2 ds1:/var/no-direct-write-here/brick1/gv0 ds3:/var/no-direct-write-here/brick1/gv0
volume create: gv0: success: please start the volume to access data
[root@glusterfs-server /]#
```
Start the volume by running ```gluster volume start gv0```
```
[root@glusterfs-server /]# gluster volume start gv0
volume start: gv0: success
[root@glusterfs-server /]#
```
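Before going further, it's worth confirming the volume is healthy. From within the glusterfs-server container, something like the following should show both bricks online:
```
# Volume configuration (type, bricks, options)
gluster volume info gv0

# Brick and self-heal daemon status
gluster volume status gv0
```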
The volume is only present on the host you're shelled into, though. To add the other hosts to the volume, run ```gluster peer probe <servername>```. Don't probe a host from itself.
From one other host, run ```docker exec -it glusterfs-server bash``` to shell into the gluster-server container, and run ```gluster peer probe <original server name>``` to update the name of the host which started the volume.
### Mount gluster volume
On the host (i.e., outside of the container - type ```exit``` if you're still shelled in), create a mountpoint for the data, by running ```mkdir /var/data```, add an entry to fstab to ensure the volume is auto-mounted on boot, and ensure the volume is actually _mounted_ if there's a network / boot delay getting access to the gluster volume:
```
mkdir /var/data
MYHOST=`hostname -s`
echo '' >> /etc/fstab
echo '# Mount glusterfs volume' >> /etc/fstab
echo "$MYHOST:/gv0 /var/data glusterfs defaults,_netdev,context=\"system_u:object_r:svirt_sandbox_file_t:s0\" 0 0" >> /etc/fstab
mount -a
```
For some reason, my nodes won't auto-mount this volume on boot. I even tried the trickery below, but they stubbornly refuse to automount.
```
echo -e "\n\n# Give GlusterFS 10s to start before \
mounting\nsleep 10s && mount -a" >> /etc/rc.local
systemctl enable rc-local.service
```
For non-gluster nodes, you'll need to replace $MYHOST above with the name of one of the gluster hosts (I haven't worked out how to make this fully HA yet)
## Serving
After completing the above, you should have:
```
[X] Persistent storage available to every node
[X] Resiliency in the event of the failure of a single (gluster) node
```
## Chef's Notes
Future enhancements to this recipe include:
1. Migration of shared storage from GlusterFS to Ceph ([#2](https://gitlab.funkypenguin.co.nz/funkypenguin/geeks-cookbook/issues/2))
2. Correct the fact that volumes don't automount on boot ([#3](https://gitlab.funkypenguin.co.nz/funkypenguin/geeks-cookbook/issues/3))


@@ -1,146 +0,0 @@
# Traefik
The platforms we plan to run on our cloud are generally web-based, and each listening on their own unique TCP port. When a container in a swarm exposes a port, then connecting to **any** swarm member on that port will result in your request being forwarded to the appropriate host running the container. (_Docker calls this the swarm "[routing mesh](https://docs.docker.com/engine/swarm/ingress/)"_)
So we get a rudimentary load balancer built into swarm. We could stop there, just exposing a series of ports on our hosts, and making them HA using keepalived.
There are some gaps to this approach though:
- No consideration is given to HTTPS. Implementation would have to be done manually, per-container.
- No mechanism is provided for authentication beyond whatever the container itself provides. We may not **want** to expose every interface on every container to the world, especially if we are playing with tools or containers whose quality and origin are unknown.
To deal with these gaps, we need a front-end load-balancer, and in this design, that role is provided by [Traefik](https://traefik.io/).
## Ingredients
## Preparation
### Prepare the host
The traefik container is aware of the __other__ docker containers in the swarm, because it has access to the docker socket at **/var/run/docker.sock**. This allows traefik to dynamically configure itself based on the labels found on containers in the swarm, which is hugely useful. To make this functionality work on our SELinux-enabled Atomic hosts, we need to add custom SELinux policy.
Run the following to build and activate policy to permit containers to access docker.sock:
```
mkdir ~/dockersock
cd ~/dockersock
curl -O https://raw.githubusercontent.com/dpw/\
selinux-dockersock/master/Makefile
curl -O https://raw.githubusercontent.com/dpw/\
selinux-dockersock/master/dockersock.te
make && semodule -i dockersock.pp
```
### Prepare traefik.toml
While it's possible to configure traefik via docker command arguments, I prefer to create a config file (traefik.toml). This allows me to change traefik's behaviour by simply changing the file, and keeps my docker config simple.
Create /var/data/traefik/traefik.toml as follows:
```
checkNewVersion = true
defaultEntryPoints = ["http", "https"]
# This section enable LetsEncrypt automatic certificate generation / renewal
[acme]
email = "<your LetsEncrypt email address>"
storage = "acme.json" # or "traefik/acme/account" if using KV store
entryPoint = "https"
acmeLogging = true
onDemand = true
OnHostRule = true
[[acme.domains]]
  main = "<your primary domain>"
# Redirect all HTTP to HTTPS (why wouldn't you?)
[entryPoints]
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
    entryPoint = "https"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.tls]
[web]
address = ":8080"
watch = true
[docker]
endpoint = "tcp://127.0.0.1:2375"
domain = "<your primary domain>"
watch = true
swarmmode = true
```
### Prepare the docker service config
Create /var/data/traefik/docker-compose.yml as follows:
```
version: "3.2"
services:
traefik:
image: traefik
command: --web --docker --docker.swarmmode --docker.watch --docker.domain=funkypenguin.co.nz --logLevel=DEBUG
ports:
- target: 80
published: 80
protocol: tcp
mode: host
- target: 443
published: 443
protocol: tcp
mode: host
- target: 8080
published: 8080
protocol: tcp
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- /var/data/traefik/traefik.toml:/traefik.toml:ro
- /var/data/traefik/acme.json:/acme.json
labels:
- "traefik.enable=false"
networks:
- public
deploy:
mode: global
placement:
constraints: [node.role == manager]
restart_policy:
condition: on-failure
networks:
public:
driver: overlay
ipam:
driver: default
config:
- subnet: 10.1.0.0/24
```
Docker won't start an image with a bind-mount to a non-existent file, so prepare acme.json by running ```touch /var/data/traefik/acme.json```.
### Launch
Deploy traefik with ```docker stack deploy traefik -c /var/data/traefik/docker-compose.yml```
Confirm traefik is running with ```docker stack ps traefik```
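To test end-to-end, a backend stack just needs to join Traefik's overlay network and carry the right labels. Here's a minimal, illustrative sketch - the hostname is a placeholder, and the network is assumed to be an externally-created overlay named "traefik" (as used by later recipes); adjust it to whatever network your Traefik service actually joined:
```
version: "3"

services:
  whoami:
    # Tiny test webserver, handy for verifying the proxy path
    image: emilevauge/whoami
    networks:
      - traefik
    deploy:
      labels:
        - traefik.frontend.rule=Host:whoami.example.com
        - traefik.docker.network=traefik
        - traefik.port=80

networks:
  traefik:
    external: true
```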
## Serving
You now have:
1. Frontend proxy which will dynamically configure itself for new backend containers
2. Automatic SSL support for all proxied resources
## Chef's Notes
Additional features I'd like to see in this recipe are:
1. Include documentation of oauth2_proxy container for protecting individual backends
2. Traefik webUI is available via HTTPS, protected with oauth_proxy
3. Pending a feature in docker-swarm to avoid NAT on routing-mesh-delivered traffic, update the design


@@ -1,80 +0,0 @@
# Virtual Machines
Let's start building our cloud with virtual machines. You could use bare-metal machines as well, the configuration would be the same. Given that most readers (myself included) will be using virtual infrastructure, from now on I'll be referring strictly to VMs.
I chose the "[Atomic](https://www.projectatomic.io/)" CentOS/Fedora image for the VM layer because:
1. I want less responsibility for maintaining the system, including ensuring regular software updates and reboots. Atomic's immutable nature means the OS is largely read-only, and updates are "atomic" (haha) procedures which can easily be rolled back if required.
2. For someone used to administrating servers individually, Atomic is a PITA. You have to employ [tricky][atomic-trick2] [tricks][atomic-trick1] to get it to install in a non-cloud environment. It's not designed for tweaking or customizing beyond what cloud-config is capable of. For my purposes, this is good, because it forces me to change my thinking - to consider every daemon as a container, and every config as code, to be checked in and version-controlled. Atomic forces this thinking on you.
3. I want the design to be as "portable" as possible. While I run it on VPSs now, I may want to migrate it to a "cloud" provider in the future, and I'll want the most portable, reproducible design.
[atomic-trick1]:https://spinningmatt.wordpress.com/2014/01/08/a-recipe-for-starting-cloud-images-with-virt-install/
[atomic-trick2]:http://blog.oddbit.com/2015/03/10/booting-cloud-images-with-libvirt/
## Ingredients
!!! summary "Ingredients"
    3 x Virtual Machines, each with:

    * [ ] CentOS/Fedora Atomic
    * [ ] At least 1GB RAM
    * [ ] At least 20GB disk space (_but it'll be tight_)
    * [ ] Connectivity to each other within the same subnet, and on a low-latency link (_i.e., no WAN links_)
## Preparation
### Install Virtual machines
1. Install / launch virtual machines.
2. The default username on CentOS Atomic is "centos", and you'll need to have supplied your SSH key during the build process.
!!! tip
If you're not using a platform with cloud-init support (i.e., you're building a VM manually, not provisioning it through a cloud provider), you'll need to refer to [trick #1][atomic-trick1] and [#2][atomic-trick2] for a means to override the automated setup, apply a manual password to the CentOS account, and enable SSH password logins.
### Prefer docker-latest
Run the following on each node to replace the default docker 1.12 with docker 1.13 (_which we need for swarm mode_):
```
systemctl disable docker --now
systemctl enable docker-latest --now
sed -i '/DOCKERBINARY/s/^#//g' /etc/sysconfig/docker
```
### Upgrade Atomic
Finally, apply any Atomic host updates, and reboot, by running: ```atomic host upgrade && systemctl reboot```.
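Once the host is back up, you can confirm which tree you're booted into, and that the docker-latest (1.13) daemon is the one in use:
```
# Show the deployed/booted Atomic trees
atomic host status

# Should report a 1.13.x server version
docker version --format '{{.Server.Version}}'
```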
### Permit connectivity between VMs
By default, Atomic only permits incoming SSH. We'll want to allow all traffic between our nodes, so add something like this to /etc/sysconfig/iptables:
```
# Allow all inter-node communication
-A INPUT -s 192.168.31.0/24 -j ACCEPT
```
And restart iptables with ```systemctl restart iptables```
### Enable host resolution
Depending on your hosting environment, you may have DNS automatically set up for your VMs. If not, it's useful to set up static entries in /etc/hosts for the nodes. For example, I set up the following:
```
192.168.31.11 ds1 ds1.funkypenguin.co.nz
192.168.31.12 ds2 ds2.funkypenguin.co.nz
192.168.31.13 ds3 ds3.funkypenguin.co.nz
```
## Serving
After completing the above, you should have:
```
[X] 3 x fresh atomic instances, at the latest releases,
running Docker v1.13 (docker-latest)
```

Binary file not shown. (Before: 135 KiB)

Binary file not shown. (Before: 14 KiB)

Binary file not shown. (Before: 140 KiB)


@@ -1,39 +0,0 @@
# What is this?
The "**[Geek's Cookbook](https://geek-cookbook.funkypenguin.co.nz)**" is a collection of guides for establishing your own highly-available docker container cluster (swarm). This swarm enables you to run self-hosted services such as [GitLab](https://gitlab.com/), [Plex](https://www.plex.tv/), [NextCloud](https://nextcloud.com), etc.
## Who is this for?
You already have a familiarity with concepts such as [virtual](https://libvirt.org/) [machines](https://www.virtualbox.org/), [Docker](https://www.docker.com/) containers, [LetsEncrypt SSL certificates](https://letsencrypt.org/), databases, and command-line interfaces.
You've probably played with self-hosting some mainstream apps yourself, like [Plex](https://www.plex.tv/), [OwnCloud](https://owncloud.org/), [Wordpress](https://wordpress.org/) or even [SandStorm](https://sandstorm.io/).
## Why should I read this?
So if you're familiar enough with the tools, and you've done self-hosting before, why would you read this book?
1. You want to upskill. You want to do container orchestration, LetsEncrypt certificates, git collaboration.
2. You want to play. You want a safe sandbox to test new tools, keeping the ones you want and tossing the ones you don't.
3. You want reliability. Once you go from __playing__ with a tool to actually __using__ it, you want it to be available when you need it. Having to "_quickly ssh into the host and restart the webserver_" doesn't cut it when your wife wants to know why her phone won't sync!
## What do you want from me?
I want your money.
No, seriously (_but yes, I do want your money - see below_). If the above applies to you, then you're like me. I want everything I wrote above, so I ended up learning all this as I went along. I enjoy it, and I'm good at it. So I created this website, partly to make sure I documented my own setup properly.
## How can I support you?
### Buy my book 📖
I'm also writing it as a formal book, on Leanpub (https://leanpub.com/geeks-cookbook). Buy it for $0.99 (which is really just a token gesture of support) - you can get it for free (in PDF, mobi, or epub format), or pay me what you think it's worth!
### [Patreonize me 💰](https://www.patreon.com/funkypenguin)
<a href="https://www.patreon.com/bePatron?u=6982506" data-patreon-widget-type="become-patron-button">Become a Patron!</a><script async src="https://c6.patreon.com/becomePatronButton.bundle.js"></script>
- [My Patreon page](https://www.patreon.com/funkypenguin)!
### Hire me 🏢
Need some system design work done? I do freelance consulting - [contact](https://www.funkypenguin.co.nz/contact/) me for details.


@@ -1,60 +0,0 @@
# Gitlab Runner
Some features of GitLab require a "[runner](https://docs.gitlab.com/runner/)" (_in the sense of a "gopher" or a "minion"_). A runner "registers" itself with a GitLab instance, and is given tasks to run. Tasks include running Continuous Integration (CI) builds, and building container images.
While a runner isn't strictly required to use GitLab, if you want to do CI, you'll need at least one. There are many ways to deploy a runner - this recipe focuses on the docker container model.
## Ingredients
1. [Docker swarm cluster](/ha-docker-swarm/) with [persistent shared storage](/ha-docker-swarm/shared-storage-ceph.md)
2. [GitLab](/ha-docker-swarm/gitlab) installation (see previous recipe)
## Preparation
### Setup data locations
We'll need several directories to bind-mount into our runner containers, so create them in /var/data/gitlab:
```
cd /var/data
mkdir gitlab
cd gitlab
mkdir -p {runners/1,runners/2}
```
### Configure runners
From your GitLab UI, you can retrieve a "token" necessary to register a new runner. To register the runner, you can either create config.toml in each runner's bind-mounted folder (example below), or just "docker exec" into each runner container and execute ```gitlab-runner register``` to interactively generate config.toml.
Sample runner config.toml:
```
concurrent = 1
check_interval = 0
[[runners]]
  name = "myrunner1"
  url = "https://gitlab.example.com"
  token = "<long string here>"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "ruby:2.1"
    privileged = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]
```
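The Serving section below deploys the runners as a swarm stack, but no compose file is included in these notes. A hedged sketch of what it might look like follows - the gitlab/gitlab-runner image keeps config.toml in /etc/gitlab-runner, and the docker executor needs access to the docker socket:
```
version: "3"

services:
  runner1:
    image: gitlab/gitlab-runner:latest
    volumes:
      - /var/data/gitlab/runners/1:/etc/gitlab-runner
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      replicas: 1

  runner2:
    image: gitlab/gitlab-runner:latest
    volumes:
      - /var/data/gitlab/runners/2:/etc/gitlab-runner
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      replicas: 1
```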
## Serving
### Launch runners
Launch the runner stack by running ```docker stack deploy gitlab-runner -c <path-to-docker-compose.yml>```
Once the runners register themselves, they'll appear in your GitLab UI (under Admin Area -> Runners for shared runners, or in the project's CI/CD settings for project-specific ones).
## Chef's Notes
1. You'll note that I set up 2 runners. One is locked to a single project (this cookbook's build), and the other is a shared runner. I wanted to ensure that one runner was always available to run CI for this project, even if I'd tied up another runner on something heavy-duty, like a container build. Customize this to your use case.
2. Originally I deployed runners in the same stack as GitLab, but I found that they would frequently fail to start properly when I launched the stack. I think that this was because the runners started so quickly (and GitLab starts so slowly!), that they always started up reporting that the GitLab instance was invalid or unavailable. I had issues with CI builds stuck permanently in a "pending" state, which were only resolved by restarting the runner. Having the runners deployed in a separate stack to GitLab avoids this problem.


@@ -1,124 +0,0 @@
# GitLab
GitLab is a self-hosted [alternative to GitHub](https://about.gitlab.com/comparison/). The most common use case is (a set of) developers with the desire for the rich feature-set of GitHub, but with unlimited private repositories.
GitLab does maintain an [official "Omnibus" container](https://docs.gitlab.com/omnibus/docker/README.html), but for this recipe I prefer the "[dockerized gitlab](https://github.com/sameersbn/docker-gitlab)" project, since it allows distribution of the various GitLab components across multiple swarm nodes.
## Ingredients
1. [Docker swarm cluster](/ha-docker-swarm/) with [persistent shared storage](/ha-docker-swarm/shared-storage-ceph.md)
2. [Traefik](/ha-docker-swarm/traefik) configured per design
## Preparation
### Setup data locations
We'll need several directories to bind-mount into our container, so create them in /var/data/gitlab:
```
cd /var/data
mkdir gitlab
cd gitlab
mkdir -p {postgresql,redis,gitlab}
```
### Prepare environment
You'll need to know the following:
1. Choose a password for postgresql - you'll need it for DB_PASS in the compose file (below)
2. Generate 3 passwords using ```pwgen -Bsv1 64```. You'll use these for the XXX_KEY_BASE environment variables below
3. Create gitlab.env, and populate it with **at least** the following variables (the full set is available at https://github.com/sameersbn/docker-gitlab#available-configuration-parameters) - a worked example follows below:
```
DB_USER=gitlab
DB_PASS=<as determined above>
TZ=Pacific/Auckland
GITLAB_TIMEZONE=Auckland
GITLAB_HTTPS=true
SSL_SELF_SIGNED=false
GITLAB_HOST
GITLAB_PORT
GITLAB_SSH_PORT
GITLAB_SECRETS_DB_KEY_BASE
GITLAB_SECRETS_SECRET_KEY_BASE
GITLAB_SECRETS_OTP_KEY_BASE
GITLAB_ROOT_PASSWORD
```
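For illustration, a fully-populated gitlab.env might end up looking something like this (every value below is a placeholder - generate your own secrets with pwgen as above):
```
DB_USER=gitlab
DB_PASS=<as determined above>
TZ=Pacific/Auckland
GITLAB_TIMEZONE=Auckland
GITLAB_HTTPS=true
SSL_SELF_SIGNED=false
GITLAB_HOST=gitlab.example.com
GITLAB_PORT=443
GITLAB_SSH_PORT=10022
GITLAB_SECRETS_DB_KEY_BASE=<64-character pwgen output>
GITLAB_SECRETS_SECRET_KEY_BASE=<64-character pwgen output>
GITLAB_SECRETS_OTP_KEY_BASE=<64-character pwgen output>
GITLAB_ROOT_PASSWORD=<your chosen root password>
```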
### Setup Docker Swarm
Create a docker swarm config file in docker-compose syntax (v3), something like this:
```
version: '3'

services:
  redis:
    image: sameersbn/redis:latest
    command:
      - --loglevel warning
    volumes:
      - /var/data/gitlab/redis:/var/lib/redis:Z
    networks:
      - internal

  postgresql:
    image: sameersbn/postgresql:9.6-2
    volumes:
      - /var/data/gitlab/postgresql:/var/lib/postgresql:Z
    networks:
      - internal
    environment:
      - DB_USER=gitlab
      - DB_PASS=<your db password>
      - DB_NAME=gitlabhq_production
      - DB_EXTENSION=pg_trgm

  gitlab:
    image: sameersbn/gitlab:latest
    networks:
      - internal
      - traefik
    deploy:
      labels:
        - traefik.frontend.rule=Host:gitlab.example.com
        - traefik.docker.network=traefik
        - traefik.port=80
      restart_policy:
        delay: 10s
        max_attempts: 10
        window: 60s
    ports:
      - "10022:22"
    volumes:
      - /var/data/gitlab/gitlab:/home/git/data:Z
    env_file: gitlab.env

networks:
  traefik:
    external: true
  internal:
    driver: overlay
    ipam:
      config:
        - subnet: 172.16.2.0/24
```
!!! tip
Setup unique static subnets for every stack you deploy. This avoids IP/gateway conflicts which can otherwise occur when you're creating/removing stacks a lot. See [my list](/reference/networks/) here.
## Serving
### Launch gitlab
Launch the GitLab stack by running ```docker stack deploy gitlab -c <path-to-docker-compose.yml>```
Log into your new instance at https://[your FQDN], with user "root" and the password you specified in gitlab.env.
## Chef's Notes
A few comments on decisions taken in this design:
1. I use the **sameersbn/gitlab:latest** image, rather than a specific version. This lets me execute updates simply by redeploying the stack (and why **wouldn't** I want the latest version?)


@@ -1,152 +0,0 @@
# Mail Server
Many of the recipes that follow require email access of some kind. It's normally possible to use a hosted service such as SendGrid, or just a gmail account. If (like me) you'd like to self-host email for your stacks, then the following recipe provides a full-stack mail server running on the docker HA swarm.
Of value to me in choosing docker-mailserver were:
1. Automatically renews LetsEncrypt certificates
2. Creation of email accounts across multiple domains (i.e., the same container gives me mailbox wekan@wekan.example.com, and gitlab@gitlab.example.com)
3. The entire configuration is based on flat files, so there's no database or persistence to worry about
docker-mailserver doesn't include a webmail client, and one is not strictly needed. Rainloop can be added either as another service within the stack, or as a standalone service. Rainloop will be covered in a future recipe.
## Ingredients
1. [Docker swarm cluster](/ha-docker-swarm/) with [persistent shared storage](/ha-docker-swarm/shared-storage-ceph.md)
2. [Traefik](/ha-docker-swarm/traefik) configured per design
3. LetsEncrypt authorized email address for domain
4. Access to manage DNS records for domains
## Preparation
### Setup data locations
We'll need several directories to bind-mount into our container, so create them in /var/data/mailserver:
```
cd /var/data
mkdir mailserver
cd mailserver
mkdir {maildata,mailstate,config,letsencrypt}
```
### Get LetsEncrypt certificate
Decide on the FQDN to assign to your mailserver. You can service multiple domains from a single mailserver - i.e., bob@dev.example.com and daphne@prod.example.com can both be served by **mail.example.com**.
The docker-mailserver container can _renew_ our LetsEncrypt certs for us, but it can't generate them. To do this, we need to run certbot (from a container) to request the initial certs and create the appropriate directory structure.
In the example below, since I'm already using Traefik to manage the LE certs for my web platforms, I opted to use the DNS challenge to prove my ownership of the domain. The certbot client will prompt you to add a DNS record for domain verification.
```
docker run -ti --rm -v \
"$(pwd)"/letsencrypt:/etc/letsencrypt certbot/certbot \
--manual --preferred-challenges dns certonly \
-d mail.example.com
```
### Get setup.sh
docker-mailserver comes with a handy bash script for managing the stack (which is just really a wrapper around the container.) It'll make our setup easier, so download it into the root of your configuration/data directory, and make it executable:
```
curl -o setup.sh \
  https://raw.githubusercontent.com/tomav/docker-mailserver/master/setup.sh
chmod a+x ./setup.sh
```
### Create email accounts
For every email address required, run ```./setup.sh email add <email> <password>``` to create the account. The command returns no output.
You can run ```./setup.sh email list``` to confirm all of your addresses have been created.
### Create DKIM DNS entries
Run ```./setup.sh config dkim``` to create the necessary DKIM entries. The command returns no output.
Examine the keys created by opendkim to identify the DNS TXT records required:
```
for i in `find config/opendkim/keys/ -name mail.txt`; do \
echo $i; \
cat $i; \
done
```
You'll end up with something like this:
```
config/opendkim/keys/gitlab.example.com/mail.txt
mail._domainkey IN TXT ( "v=DKIM1; k=rsa; "
"p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCYuQqDg2ZG8ZOfI1PvarF1Gcr5cJnCR8BeCj5HYgeRohSrxKL5utPEF/AWAxXYwnKpgYN837fu74GfqsIuOhu70lPhGV+O2gFVgpXYWHELvIiTqqO0QgarIN63WE2gzE4s0FckfLrMuxMoXr882wuzuJhXywGxOavybmjpnNHhbQIDAQAB" ) ; ----- DKIM key mail for gitlab.example.com
[root@ds1 mail]#
```
Create the necessary DNS TXT entries for your domain(s). Note that although opendkim splits the record across two lines, the actual record should be concatenated on creation. I.e., the DNS TXT record above should read:
```
"v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCYuQqDg2ZG8ZOfI1PvarF1Gcr5cJnCR8BeCj5HYgeRohSrxKL5utPEF/AWAxXYwnKpgYN837fu74GfqsIuOhu70lPhGV+O2gFVgpXYWHELvIiTqqO0QgarIN63WE2gzE4s0FckfLrMuxMoXr882wuzuJhXywGxOavybmjpnNHhbQIDAQAB"
```
### Setup Docker Swarm
Create a docker swarm config file in docker-compose syntax (v3), something like this:
```
version: '3'
services:
  mail:
    image: tvial/docker-mailserver:latest
    ports:
      - "25:25"
      - "587:587"
      - "993:993"
    volumes:
      - /var/data/mail/maildata:/var/mail
      - /var/data/mail/mailstate:/var/mail-state
      - /var/data/mail/config:/tmp/docker-mailserver
      - /var/data/mail/letsencrypt:/etc/letsencrypt
    env_file: /var/data/mail/.env
    networks:
      - internal
    deploy:
      replicas: 1
networks:
  traefik:
    external: true
  internal:
    driver: overlay
    ipam:
      config:
        - subnet: 172.16.2.0/24
```
!!! tip
    Setup unique static subnets for every stack you deploy. This avoids IP/gateway conflicts which can otherwise occur when you're creating/removing stacks a lot.
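If you lose track of which subnets are already in use, you can ask docker itself. This is a minimal sketch using standard docker CLI commands, run from a swarm manager node:
```
# Print the name and subnet of every overlay network
for net in $(docker network ls --filter driver=overlay -q); do
  docker network inspect -f '{{.Name}}: {{range .IPAM.Config}}{{.Subnet}} {{end}}' $net
done
```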
A sample .env file looks like this:
```
ENABLE_SPAMASSASSIN=1
ENABLE_CLAMAV=1
ENABLE_POSTGREY=1
ONE_DIR=1
OVERRIDE_HOSTNAME=mail.example.com
OVERRIDE_DOMAINNAME=mail.example.com
POSTMASTER_ADDRESS=admin@example.com
PERMIT_DOCKER=network
SSL_TYPE=letsencrypt
```
## Serving
### Launch mailserver
Launch the mail server stack by running ```docker stack deploy mailserver -c <path-to-docker-compose.yml>```
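Once the stack converges, confirm the services are running and that SMTP submission answers on port 587. This is just a sanity check (it assumes the openssl CLI is available on your workstation; substitute your own FQDN):
```
# Confirm the stack's services have started
docker stack services mailserver

# Test the STARTTLS handshake on the submission port
openssl s_client -starttls smtp -connect mail.example.com:587
```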
## Chef's Notes
1. One of the elements of this design which I didn't appreciate at first is that since the config is entirely file-based, **setup.sh** can be run on any container host, provided it has the shared data mounted. This means that even though docker-mailserver was not designed with docker swarm in mind, it works perfectly with swarm. I.e., from any node, regardless of where the container is actually running, you're able to add/delete email addresses, view logs, etc.

View File

@@ -1,115 +0,0 @@
# Wekan
Wekan is an open-source kanban board which allows card-based task and to-do management, similar to tools like WorkFlowy or Trello.
![Wekan Screenshot](../images/wekan.jpg)
Wekan allows you to create Boards, on which Cards can be moved around between a number of Columns. Boards can have many members, allowing for easy collaboration - just add everyone who should be able to work with you on the board, and you're good to go! You can assign colored Labels to cards to facilitate grouping and filtering; additionally, you can add members to a card, for example to assign a task to someone.
There's a [video](https://www.youtube.com/watch?v=N3iMLwCNOro) of the developer showing off the app, as well as a [functional demo](https://wekan.indie.host/b/t2YaGmyXgNkppcFBq/wekan-fork-roadmap).
!!! note
    For added privacy, this design secures wekan behind an [oauth2 proxy](/reference/oauth_proxy/), so that in order to gain access to the wekan UI at all, oauth2 authentication (to GitHub, GitLab, Google, etc) must already have occurred.
## Ingredients
1. [Docker swarm cluster](/ha-docker-swarm/) with [persistent shared storage](/ha-docker-swarm/shared-storage-ceph.md)
2. [Traefik](/ha-docker-swarm/traefik) configured per design
## Preparation
### Setup data locations
We'll need several directories to bind-mount into our container, so create them in /var/data/wekan:
```
mkdir /var/data/wekan
cd /var/data/wekan
mkdir -p {wekan-db,wekan-db-dump}
```
### Prepare environment
You'll need to do the following:
1. Choose an oauth provider, and obtain a client ID and secret
2. Create wekan.env, and populate with the following variables
```
OAUTH2_PROXY_CLIENT_ID=
OAUTH2_PROXY_CLIENT_SECRET=
OAUTH2_PROXY_COOKIE_SECRET=
MONGO_URL=mongodb://wekandb:27017/wekan
ROOT_URL=https://wekan.example.com
MAIL_URL=smtp://wekan@wekan.example.com:password@mail.example.com:587/
MAIL_FROM="Wekan <wekan@wekan.example.com>"
```
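OAUTH2_PROXY_COOKIE_SECRET should be a random string. One way to generate a suitable value (a sketch using openssl; any random-string generator will do):
```
openssl rand -hex 16
```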
### Setup Docker Swarm
Create a docker swarm config file in docker-compose syntax (v3), something like this:
```
version: '3'
services:
  wekandb:
    image: mongo:3.2.15
    command: mongod --smallfiles --oplogSize 128
    networks:
      - internal
    volumes:
      - /var/data/wekan/wekan-db:/data/db
      - /var/data/wekan/wekan-db-dump:/dump
  proxy:
    image: zappi/oauth2_proxy
    env_file: /var/data/wekan/wekan.env
    networks:
      - traefik
      - internal
    deploy:
      labels:
        - traefik.frontend.rule=Host:wekan.example.com
        - traefik.docker.network=traefik
        - traefik.port=4180
    command: |
      -cookie-secure=false
      -upstream=http://wekan:80
      -redirect-url=https://wekan.example.com
      -http-address=http://0.0.0.0:4180
      -email-domain=example.com
      -provider=github
  wekan:
    image: wekanteam/wekan:latest
    networks:
      - internal
    env_file: /var/data/wekan/wekan.env
networks:
  traefik:
    external: true
  internal:
    driver: overlay
    ipam:
      config:
        - subnet: 172.16.3.0/24
```
!!! tip
    Setup unique static subnets for every stack you deploy. This avoids IP/gateway conflicts which can otherwise occur when you're creating/removing stacks a lot. See [my list](/reference/networks/) here.
## Serving
### Launch Wekan stack
Launch the Wekan stack by running ```docker stack deploy wekan -c <path-to-docker-compose.yml>```
Log into your new instance at https://**YOUR-FQDN**. Since this design places an oauth_proxy in front of Wekan, you'll first be prompted to authenticate against your chosen OAuth provider before the Wekan UI loads.
## Chef's Notes
1. If you wanted to expose the Wekan UI directly, you could remove the oauth2_proxy from the design, and move the traefik-related labels directly to the wekan container. You'd also need to add the traefik network to the wekan container.

View File

@@ -1,8 +0,0 @@
The workflow for creating a recipe
1. In my gitlab repo for the cookbook, I create an issue to track the new recipe I want to create
2. From the issue, I create a branch
3. I pull the branch locally - "git pull" followed by "git checkout <branch>" (a sketch of these local steps follows this list)
4. I "git add" my changes, commit, and push the branch
5. I create a merge request from the branch, to track merging into production
6. I set the merge request to "merge when pipeline succeeds"
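A minimal sketch of the local steps (the branch and file names below are hypothetical):
```
git pull
git checkout 12-add-new-recipe
# ...write the recipe...
git add recipies/new-recipe.md
git commit -m "Add new recipe"
git push origin 12-add-new-recipe
```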

View File

@@ -1,29 +0,0 @@
# Introduction
Our HA platform design relies on Atomic OS, which contains only the bare minimum elements needed to run containers - git isn't among them.
So how can we use git on this system to push/pull the changes we make to our config files?
The answer is to run git itself from a container. First, generate an SSH keypair - the container's /root is bind-mounted from /var/data/git-docker/data, so the key persists between runs:
```
docker run -v /var/data/git-docker/data:/root funkypenguin/git-docker ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519
Generating public/private ed25519 key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_ed25519.
Your public key has been saved in /root/.ssh/id_ed25519.pub.
The key fingerprint is:
SHA256:uZtriS7ypx7Q4kr+w++nHhHpcRfpf5MhxP3Wpx3H3hk root@a230749d8d8a
The key's randomart image is:
+--[ED25519 256]--+
| .o . |
| . ..o . |
| + .... ...|
| .. + .o . . E=|
| o .o S . . ++B|
| . o . . . +..+|
| .o .. ... . . |
|o..o..+.oo |
|...=OX+.+. |
+----[SHA256]-----+
[root@ds3 data]#
```
Add the resulting public key (found on the host at /var/data/git-docker/data/.ssh/id_ed25519.pub) to your git host as an authorized/deploy key, then define an alias so that running "git" transparently runs git from the container, with the current directory mounted in:
```
alias git='docker run -v $PWD:/var/data -v /var/data/git-docker/data:/root funkypenguin/git-docker git'
```
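The alias only lasts for the current shell session; to make it permanent, append it to your shell profile (a sketch, assuming bash):
```
cat >> ~/.bashrc <<'EOF'
alias git='docker run -v $PWD:/var/data -v /var/data/git-docker/data:/root funkypenguin/git-docker git'
EOF

# Quick test - this runs git inside the container
git --version
```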

View File

@@ -1,10 +0,0 @@
# Networks
In order to avoid IP addressing conflicts as we bring swarm networks up/down, we will statically address each docker overlay network, and record the details below:
Network | Range
--|--
[Traefik](/ha-docker-swarm/traefik/) | _unspecified_
[Mail Server](/recipies/mail/) | 172.16.1.0/24
[Gitlab](/recipies/gitlab/) | 172.16.2.0/24
[Wekan](/recipies/wekan/) | 172.16.3.0/24

View File

@@ -1,79 +0,0 @@
# OAuth proxy
Some of the platforms we use on our swarm have strong, proven security to prevent abuse - techniques such as rate-limiting (to defeat brute-force attacks), or even 2-factor authentication (tiny-tiny-rss and Wallabag support this).
Other platforms may provide **no authentication** (Traefik's web UI for example), or minimal, un-proven UI authentication which may have been added as an afterthought.
Still other platforms may hold such sensitive data (i.e., NextCloud) that we'll feel more secure by putting an additional authentication layer in front of them.
This is the role of the OAuth proxy.
## How does it work?
**Normally**, Traefik proxies web requests directly to individual web apps running in containers. The user talks directly to the webapp, and the webapp is responsible for ensuring appropriate authentication.
When employing the **OAuth proxy**, the proxy sits in the middle of this transaction - traefik sends the web client to the OAuth proxy, the proxy authenticates the user against a 3rd-party source (_GitHub, Google, etc_), and then passes authenticated requests on to the web app in the container.
Illustrated below:
![OAuth proxy](/images/oauth_proxy.png)
The advantage of this design is additional security. If I'm deploying a web app which I expect only myself to need access to, I'll put the oauth_proxy in front of it. The overhead is negligible, and the additional layer of security is well worth it.
## Ingredients
## Preparation
### OAuth provider
OAuth Proxy currently supports the following OAuth providers:
* Google (default)
* Azure
* Facebook
* GitHub
* GitLab
* LinkedIn
* MyUSA
Follow the [instructions](https://github.com/bitly/oauth2_proxy) to set up your oauth provider. You need to set up a unique key/secret for **each** instance of the proxy you want to run, since in each case the callback URL will differ.
### Authorized emails file
There are a variety of options with oauth_proxy regarding which email addresses (authenticated against your oauth provider) should be permitted access. You can permit access based on email domain (*@gmail.com), on individual email addresses (batman@gmail.com), or on provider-specific groups (_i.e., a GitHub organization_).
The most restrictive configuration allows access on a per-email address basis, which is illustrated below:
I created **/var/data/oauth_proxy/authenticated-emails.txt**, and added my own email address on the first line.
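A minimal sketch, using the example address from above (one address per line):
```
mkdir -p /var/data/oauth_proxy
echo "batman@gmail.com" > /var/data/oauth_proxy/authenticated-emails.txt
```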
### Configure stack
You'll need to define a service for the oauth_proxy in every stack which you want to protect. Here's an example from the [Wekan](/recipies/wekan/) recipe:
```
  proxy:
    image: zappi/oauth2_proxy
    env_file: /var/data/wekan/wekan.env
    networks:
      - traefik
      - internal
    deploy:
      labels:
        - traefik.frontend.rule=Host:wekan.funkypenguin.co.nz
        - traefik.docker.network=traefik
        - traefik.port=4180
    volumes:
      - /var/data/oauth_proxy/authenticated-emails.txt:/authenticated-emails.txt
    command: |
      -cookie-secure=false
      -upstream=http://wekan:80
      -redirect-url=https://wekan.funkypenguin.co.nz
      -http-address=http://0.0.0.0:4180
      -email-domain=funkypenguin.co.nz
      -provider=github
      -authenticated-emails-file=/authenticated-emails.txt
```
Note above how:
* Labels are required to tell Traefik to forward the traffic to the proxy, rather than to the backend container running the app
* An environment file is defined, but...
* The redirect URL must still be passed to the oauth_proxy in the command argument

View File

@@ -1,39 +0,0 @@
# Welcome to Funky Penguin's Geek Cookbook
## Hello world,
I'm [David](https://www.funkypenguin.co.nz/contact/).
I've spent 20+ years working with technology. My current role is **Senior Infrastructure Architect** at [Prophecy Networks Ltd](http://www.prophecy.net.nz) in New Zealand, with a specific interest in networking, systems, open-source, and business management.
I've had a [book published](https://www.funkypenguin.co.nz/book/phplist-2-email-campaign-manager/), and I [blog](https://www.funkypenguin.co.nz/blog/) on topics that interest me.
## Why Funky Penguin?
My first "real" job, out of high-school, was working the IT helpdesk in a typical pre-2000 organization in South Africa. I enjoyed experimenting with Linux, and cut my teeth by replacing the organization's Exchange 5.5 mail platform with a 15-site [qmail-ldap](http://www.nrg4u.com/) cluster, with [amavis](https://en.wikipedia.org/wiki/Amavis) virus-scanning.
One of our suppliers asked me to quote to do the same for their organization. With nothing to lose, and half-expecting to be turned down, I quoted a generous fee, and chose a cheeky company name. The supplier immediately accepted my quote, and the name ("_Funky Penguin_") stuck.
## Technical Documentation
During the same "real" job above, I wanted to deploy [jabberd](https://en.wikipedia.org/wiki/Jabberd14), for internal instant messaging within the organization, and as a means to control the sprawl of ad-hoc instant-messaging among staff, using ICQ, MSN, and Yahoo Messenger.
To get management approval to deploy, I wrote a logger (with web UI) for jabber conversations ([Bandersnatch](https://www.funkypenguin.co.nz/project/bandersnatch/)), and a [75-page user manual](https://www.funkypenguin.co.nz/book/jajc-manual/) (in [Docbook XML](http://www.docbook.org/)) for a spunky Russian WinXP jabber client, [JAJC](http://jajc.jrudevels.org/).
Due to my contributions to [phpList](http://www.phplist.com), I was approached in 2011 by [Packt Publishing](http://www.packtpub.com), to [write a book](https://www.funkypenguin.co.nz/book/phplist-2-email-campaign-manager) about using PHPList.
## Contact Me
Contact me by:
* Email ([davidy@funkypenguin.co.nz](mailto:davidy@funkypenguin.co.nz))
* Twitter ([@funkypenguin](https://twitter.com/funkypenguin))
* Mastodon ([@davidy@funkypenguin.co.nz](https://mastodon.funkypenguin.co.nz/@davidy))
Or by using the form below:
<div class="panel">
<iframe width="100%" height="400" frameborder="0" scrolling="no" src="https://funkypenguin.wufoo.com/forms/z16038vt0bk5txp/"></iframe>
</div>