
Tidy up like it's 2019

This commit is contained in:
David Young
2019-05-15 10:54:50 +12:00
parent 480dee111a
commit 4654c131b2
97 changed files with 266 additions and 204 deletions

View File

@@ -1,11 +1,11 @@
# Design
In the design described below, the "private cloud" platform is:
In the design described below, our "private cloud" platform is:
* **Highly-available** (_can tolerate the failure of a single component_)
* **Scalable** (_can add resource or capacity as required_)
* **Portable** (_run it on your garage server today, run it in AWS tomorrow_)
* **Secure** (_access protected with LetsEncrypt certificates_)
* **Secure** (_access protected with [LetsEncrypt certificates](/ha-docker-swarm/traefik/) and optional [OIDC with 2FA](/ha-docker-swarm/traefik-forward-auth/)_)
* **Automated** (_requires minimal care and feeding_)
## Design Decisions
@@ -15,7 +15,10 @@ In the design described below, the "private cloud" platform is:
This means that:
* At least 3 docker swarm manager nodes are required, to provide fault-tolerance of a single failure.
* GlusterFS is employed for share filesystem, because it too can be made tolerant of a single failure.
* [Ceph](/ha-docker-swarm/shared-storage-ceph/) is employed for shared storage, because it too can be made tolerant of a single failure.
!!! note
An exception to the 3-nodes decision is running a single-node configuration. If you only **have** one node, then obviously your swarm is only as resilient as that node. It's still a perfectly valid swarm configuration, ideal for starting your self-hosting journey. In fact, under the single-node configuration, you don't need ceph either, and you can simply use the local volume on your host for storage. You'll be able to migrate to ceph/more nodes if/when you expand.
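As a rule of thumb, swarm manager fault-tolerance follows standard Raft quorum arithmetic: with N managers, a majority of `floor(N/2) + 1` must remain reachable, so the swarm tolerates `floor((N-1)/2)` manager failures (*0 of 1, 1 of 3, 2 of 5*). This is why 3 (*not 2*) is the minimum node count for any real fault-tolerance.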
**Where multiple solutions to a requirement exist, preference will be given to the most portable solution.**
@@ -26,30 +29,31 @@ This means that:
## Security
Under this design, the only inbound connections we're permitting to our docker swarm are:
Under this design, the only inbound connections we're permitting to our docker swarm in a **minimal** configuration (*you may add custom services later, like UniFi Controller*) are:
### Network Flows
* HTTP (TCP 80) : Redirects to https
* HTTPS (TCP 443) : Serves individual docker containers via SSL-encrypted reverse proxy
* **HTTP (TCP 80)** : Redirects to https
* **HTTPS (TCP 443)** : Serves individual docker containers via SSL-encrypted reverse proxy
### Authentication
* Where the proxied application provides a trusted level of authentication, or where the application requires public exposure,
* Where the hosted application provides a trusted level of authentication (*i.e., [NextCloud](/recipes/nextcloud/)*), or where the application requires public exposure (*i.e. [Privatebin](/recipes/privatebin/)*), no additional layer of authentication will be required.
* Where the hosted application provides inadequate (*i.e. [NZBGet](/recipes/autopirate/nzbget/)*) or no authentication (*i.e. [Gollum](/recipes/gollum/)*), a further authentication against an OAuth provider will be required.
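As a hedged illustration only (*the service, hostname and port below are hypothetical; the real wiring is covered in the [Traefik Forward Auth](/ha-docker-swarm/traefik-forward-auth/) recipe*), adding that OAuth layer to a weakly-authenticated service amounts to one extra Traefik v1 label in its stack file:
```
version: "3"

services:
  whoami:
    image: containous/whoami   # hypothetical demo app with no built-in auth
    networks:
      - traefik_public
    deploy:
      labels:
        # send every request through the OAuth forward-auth service before proxying
        - traefik.frontend.auth.forward.address=http://traefik-forward-auth:4181
        - traefik.frontend.rule=Host:whoami.example.com
        - traefik.port=80
        - traefik.docker.network=traefik_public

networks:
  traefik_public:
    external: true
```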
## High availability
### Normal function
Assuming 3 nodes, under normal circumstances the following is illustrated:
Assuming a 3-node configuration, under normal circumstances the following is illustrated:
* All 3 nodes provide shared storage via GlusterFS, which is provided by a docker container on each node. (i.e., not running in swarm mode)
* All 3 nodes provide shared storage via Ceph, which runs in a docker container on each node.
* All 3 nodes participate in the Docker Swarm as managers.
* The various containers belonging to the application "stacks" deployed within Docker Swarm are automatically distributed amongst the swarm nodes.
* Persistent storage for the containers is provide via GlusterFS mount.
* The **traefik** service (in swarm mode) receives incoming requests (on http and https), and forwards them to individual containers. Traefik knows the containers names because it's able to access the docker socket.
* All 3 nodes run keepalived, at different priorities. Since traefik is running as a swarm service and listening on TCP 80/443, requests made to the keepalived VIP and arriving at **any** of the swarm nodes will be forwarded to the traefik container (no matter which node it's on), and then onto the target backend.
* Persistent storage for the containers is provided via a cephfs mount.
* The **traefik** service (*in swarm mode*) receives incoming requests (*on HTTP and HTTPS*), and forwards them to individual containers. Traefik knows the containers' names because it's able to read the docker socket.
* All 3 nodes run keepalived, at varying priorities. Since traefik is running as a swarm service and listening on TCP 80/443, requests made to the keepalived VIP and arriving at **any** of the swarm nodes will be forwarded to the traefik container (*no matter which node it's on*), and then onto the target backend.
![HA function](../images/docker-swarm-ha-function.png)
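To sanity-check that a running 3-node swarm matches the healthy state described above, a few quick commands can be run on any node (*VIP and mount point hypothetical*):
```
docker node ls                    # expect 3 managers: one Leader, two Reachable
docker service ls                 # expect traefik (and friends) at their desired replica counts
df -h /var/data                   # expect a healthy cephfs mount backing persistent storage
ip addr | grep 192.168.31.10      # does *this* node currently hold the keepalived VIP?
```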
@@ -57,9 +61,9 @@ Assuming 3 nodes, under normal circumstances the following is illustrated:
In the case of a failure (or scheduled maintenance) of one of the nodes, the following is illustrated:
* The failed node no longer participates in GlusterFS, but the remaining nodes provide enough fault-tolerance for the cluster to operate.
* The failed node no longer participates in Ceph, but the remaining nodes provide enough fault-tolerance for the cluster to operate.
* The remaining two nodes in Docker Swarm achieve a quorum and agree that the failed node is to be removed.
* The (possibly new) leader manager node reschedules the containers known to be running on the failed node, onto other nodes.
* The (*possibly new*) leader manager node reschedules the containers known to be running on the failed node, onto other nodes.
* The **traefik** service is either restarted or unaffected, and as the backend containers stop/start and change IP, traefik is aware and updates accordingly.
* The keepalived VIP continues to function on the remaining nodes, and docker swarm continues to forward any traffic received on TCP 80/443 to the appropriate node.
@@ -67,9 +71,9 @@ In the case of a failure (or scheduled maintenance) of one of the nodes, the fol
### Node restore
When the failed (or upgraded) host is restored to service, the following is illustrated:
When the failed (*or upgraded*) host is restored to service, the following is illustrated:
* GlusterFS regains full redundancy
* Ceph regains full redundancy
* Docker Swarm managers become aware of the recovered node, and will use it for scheduling **new** containers
* Existing containers which were migrated off the node are not migrated back
* Keepalived VIP regains full redundancy
@@ -90,7 +94,7 @@ In summary, although I suffered an **unplanned power outage to all of my infrast
## Chef's Notes
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -4,21 +4,20 @@ For truly highly-available services with Docker containers, we need an orchestra
## Ingredients
* 3 x CentOS Atomic hosts (bare-metal or VMs). A reasonable minimum would be:
* 1 x vCPU
* 1GB RAM
* 10GB HDD
* Hosts must be within the same subnet, and connected on a low-latency link (i.e., no WAN links)
!!! summary
* [X] 3 x modern linux hosts (*bare-metal or VMs*). A reasonable minimum would be:
* 2 x vCPU
* 2GB RAM
* 20GB HDD
* [X] Hosts must be within the same subnet, and connected on a low-latency link (*i.e., no WAN links*)
## Preparation
### Release the swarm!
Now, to launch my swarm:
Now, to launch a swarm: pick a target node, and run `docker swarm init`.
```docker swarm init```
Yeah, that was it. Now I have a 1-node swarm.
Yeah, that was it. Seriously. Now we have a 1-node swarm.
```
[root@ds1 ~]# docker swarm init
@@ -35,7 +34,7 @@ To add a manager to this swarm, run 'docker swarm join-token manager' and follow
[root@ds1 ~]#
```
Run ```docker node ls``` to confirm that I have a 1-node swarm:
Run `docker node ls` to confirm that you have a 1-node swarm:
```
[root@ds1 ~]# docker node ls
@@ -44,7 +43,7 @@ b54vls3wf8xztwfz79nlkivt8 * ds1.funkypenguin.co.nz Ready Active Leade
[root@ds1 ~]#
```
Note that when I ran ```docker swarm init``` above, the CLI output gave me a command to run to join further nodes to my swarm. This would join the nodes as __workers__ (as opposed to __managers__). Workers can easily be promoted to managers (and demoted again), but since we know that we want our other two nodes to be managers too, it's simpler just to add them to the swarm as managers immediately.
Note that when you run `docker swarm init` above, the CLI output gives you a command to run to join further nodes to your swarm. This command would join the nodes as __workers__ (*as opposed to __managers__*). Workers can easily be promoted to managers (*and demoted again*), but since we know that we want our other two nodes to be managers too, it's simpler just to add them to the swarm as managers immediately.
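For reference, promotion (*and demotion*) is a one-liner either way (*node name hypothetical*):
```
docker node promote ds2.example.com   # worker -> manager
docker node demote ds2.example.com    # manager -> worker
```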
On the first swarm node, generate the necessary token to join another manager by running ```docker swarm join-token manager```:
@@ -59,7 +58,7 @@ To add a manager to this swarm, run the following command:
[root@ds1 ~]#
```
Run the command provided on your second node to join it to the swarm as a manager. After adding the second node, the output of ```docker node ls``` (on either host) should reflect two nodes:
Run the command provided on your other nodes to join them to the swarm as managers. After each node is added, the output of ```docker node ls``` (on any manager) should reflect all the nodes:
````
@@ -70,19 +69,6 @@ xmw49jt5a1j87a6ihul76gbgy * ds2.funkypenguin.co.nz Ready Active Reach
[root@ds2 davidy]#
````
Repeat the process to add your third node.
Finally, ```docker node ls``` should reflect that you have 3 reachable manager nodes, one of whom is the "Leader":
```
[root@ds3 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
36b4twca7i3hkb7qr77i0pr9i ds1.example.com Ready Active Reachable
l14rfzazbmibh1p9wcoivkv1s * ds3.example.com Ready Active Reachable
tfsgxmu7q23nuo51wwa4ycpsj ds2.example.com Ready Active Leader
[root@ds3 ~]#
```
### Setup automated cleanup
Docker swarm doesn't do any cleanup of old images, so as you experiment with various stacks, and as updated containers are released upstream, you'll soon find yourself losing gigabytes of disk space to old, unused images.
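For context, the manual equivalent of what we're about to automate is an occasional prune on each node (*standard docker CLI flags*):
```
docker system prune --force        # remove stopped containers, dangling images and unused networks
docker image prune --all --force   # additionally remove *all* images not referenced by a container
```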
@@ -135,8 +121,8 @@ Create /var/data/config/shepherd/shepherd.env as follows:
```
# Don't auto-update Plex or Emby, I might be watching a movie! (Customize this for the containers you _don't_ want to auto-update)
BLACKLIST_SERVICES="plex_plex emby_emby"
# Run every 24 hours. I _really_ don't need new images more frequently than that!
SLEEP_TIME=1440
# Run every 24 hours. Note that SLEEP_TIME appears to be in seconds.
SLEEP_TIME=86400
```
Then create /var/data/config/shepherd/shepherd.yml as follows:
@@ -178,7 +164,7 @@ echo 'source ~/gcb-aliases.sh' >> ~/.bash_profile
## Chef's Notes
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -4,20 +4,21 @@ While having a self-healing, scalable docker swarm is great for availability and
In order to provide seamless external access to clustered resources (*regardless of which node they're on, and tolerant of node failure*), you need to present a single IP to the world.
Normally this is done using a HA loadbalancer, but since Docker Swarm aready provides the load-balancing capabilities (routing mesh), all we need for seamless HA is a virtual IP which will be provided by more than one docker node.
Normally this is done using an HA load balancer, but since Docker Swarm already provides the load-balancing capabilities (*[routing mesh](https://docs.docker.com/engine/swarm/ingress/)*), all we need for seamless HA is a virtual IP which will be provided by more than one docker node.
This is accomplished with the use of keepalived on at least two nodes.
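For the curious, the sort of configuration keepalived ends up running (*whether natively or inside a container*) looks roughly like the sketch below. The interface, peer IPs, VIP and password are all hypothetical, and the other nodes differ only in their `priority` and peer list:
```
vrrp_instance geek_cookbook_vip {
  state MASTER               # purely a starting hint; the highest priority wins
  interface eth0             # the interface carrying this node's LAN IP
  virtual_router_id 51
  priority 100               # e.g. 90 and 80 on the other two nodes
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass supersecret
  }
  unicast_peer {             # the *other* nodes' real IPs
    192.168.31.12
    192.168.31.13
  }
  virtual_ipaddress {
    192.168.31.10            # the shared VIP your DNS records point at
  }
}
```
Whichever node advertises the highest priority holds the VIP; when it fails, the next-highest takes over within a few seconds.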
## Ingredients
```
Already deployed:
[X] At least 2 x CentOS/Fedora Atomic VMs
[X] low-latency link (i.e., no WAN links)
!!! summary "Ingredients"
Already deployed:
New:
[ ] 3 x IPv4 addresses (one for each node and one for the virtual IP)
```
* [X] At least 2 x swarm nodes
* [X] low-latency link (i.e., no WAN links)
New:
* [ ] At least 3 x IPv4 addresses (one for each node and one for the virtual IP)
## Preparation
@@ -66,10 +67,10 @@ That's it. Each node will talk to the other via unicast (no need to un-firewall
## Chef's notes
1. Some hosting platforms (OpenStack, for one) won't allow you to simply "claim" a virtual IP. Each node is only able to receive traffic targetted to its unique IP. In this case, keepalived is not the right solution, and a platform-specific load-balancing solution should be used. In OpenStack, this is Neutron's "Load Balancer As A Service" (LBAAS) component. AWS and Azure would likely include similar protections.
1. Some hosting platforms (*OpenStack, for one*) won't allow you to simply "claim" a virtual IP. Each node is only able to receive traffic targeted to its unique IP, unless certain security controls are disabled by the cloud administrator. In this case, keepalived is not the right solution, and a platform-specific load-balancing solution should be used. In OpenStack, this is Neutron's "Load Balancer As A Service" (LBAAS) component. AWS, GCP and Azure would likely include similar protections.
2. More than 2 nodes can participate in keepalived. Simply ensure that each node has the appropriate priority set, and the node with the highest priority will become the master.
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -112,7 +112,7 @@ systemctl restart docker-latest
## Chef's notes
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -196,7 +196,7 @@ Future enhancements to this recipe include:
1. Rather than pasting a secret key into /etc/fstab (which feels wrong), I'd prefer to be able to set "secretfile" in /etc/fstab (which just points ceph.mount to a file containing the secret), but under the current CentOS Atomic, we're stuck with "secret", per https://bugzilla.redhat.com/show_bug.cgi?id=1030402
2. This recipe was written with Ceph v11 "Jewel". Ceph have subsequently released v12 "Kraken". I've updated the recipe for the addition of "Manager" daemons, but it should be noted that the [only reader so far](https://discourse.geek-kitchen.funkypenguin.co.nz/u/ggilley) to attempt a Ceph install using CentOS Atomic and Ceph v12 had issues with OSDs, which led him to [move to Ubuntu 1604](https://discourse.geek-kitchen.funkypenguin.co.nz/t/shared-storage-ceph-funky-penguins-geek-cookbook/47/24?u=funkypenguin) instead.
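As a hedged illustration of the two fstab styles discussed above (*monitor IPs, mount point and key are hypothetical*):
```
# what we're stuck with under Atomic: the secret pasted inline
192.168.31.11,192.168.31.12,192.168.31.13:/ /var/data ceph name=admin,secret=AQDhypotheticalkey==,noatime,_netdev 0 0

# the preferred form: fstab only points at a root-readable secret file
192.168.31.11,192.168.31.12,192.168.31.13:/ /var/data ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev 0 0
```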
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -2,6 +2,9 @@
While Docker Swarm is great for keeping containers running (_and restarting those that fail_), it does nothing for persistent storage. This means if you actually want your containers to keep any data persistent across restarts (_hint: you do!_), you need to provide shared storage to every docker node.
!!! warning
This recipe is deprecated. It didn't work well in 2017, and it's not likely to work any better now. It remains here as a reference. I now recommend the use of [Ceph for shared storage](/ha-docker-swarm/shared-storage-ceph/) instead. - 2019 Chef
## Design
### Why GlusterFS?
@@ -163,7 +166,7 @@ Future enhancements to this recipe include:
1. Migration of shared storage from GlusterFS to Ceph ([#2](https://gitlab.funkypenguin.co.nz/funkypenguin/geeks-cookbook/issues/2))
2. Correct the fact that volumes don't automount on boot ([#3](https://gitlab.funkypenguin.co.nz/funkypenguin/geeks-cookbook/issues/3))
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -103,7 +103,7 @@ What have we achieved? By adding an additional three simple labels to any servic
3. I reviewed several implementations of forward authenticators for Traefik, but found most to be rather heavy-handed, or specific to a single auth provider. @thomaseddon's go-based docker image is 7MB in size, and with the generic OIDC patch (above), it can be extended to work with any OIDC provider.
4. No, not github natively, but you can federate GitHub into KeyCloak, and then use KeyCloak as the OIDC provider.
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -11,15 +11,24 @@ There are some gaps to this approach though:
To deal with these gaps, we need a front-end load-balancer, and in this design, that role is provided by [Traefik](https://traefik.io/).
![Traefik Screenshot](../images/traefik.png)
## Ingredients
1. [Docker swarm cluster](/ha-docker-swarm/design/) with [persistent shared storage](/ha-docker-swarm/shared-storage-ceph)
!!! summary "You'll need"
Already deployed:
* [X] [Docker swarm cluster](/ha-docker-swarm/design/) with [persistent shared storage](/ha-docker-swarm/shared-storage-ceph)
New to this recipe:
* [ ] Access to update your DNS records for manual/automated [LetsEncrypt](https://letsencrypt.org/docs/challenge-types/) DNS-01 validation, or ingress HTTP/HTTPS for HTTP-01 validation
## Preparation
### Prepare the host
The traefik container is aware of the __other__ docker containers in the swarm, because it has access to the docker socket at **/var/run/docker.sock**. This allows traefik to dynamically configure itself based on the labels found on containers in the swarm, which is hugely useful. To make this functionality work on our SELinux-enabled Atomic hosts, we need to add custom SELinux policy.
The traefik container is aware of the __other__ docker containers in the swarm, because it has access to the docker socket at **/var/run/docker.sock**. This allows traefik to dynamically configure itself based on the labels found on containers in the swarm, which is hugely useful. To make this functionality work on an SELinux-enabled CentOS 7 host, we need to add a custom SELinux policy.
!!! tip
The following is only necessary if you're using SELinux!
@@ -38,9 +47,9 @@ make && semodule -i dockersock.pp
### Prepare traefik.toml
While it's possible to configure traefik via docker command arguments, I prefer to create a config file (traefik.toml). This allows me to change traefik's behaviour by simply changing the file, and keeps my docker config simple.
While it's possible to configure traefik via docker command arguments, I prefer to create a config file (`traefik.toml`). This allows me to change traefik's behaviour by simply changing the file, and keeps my docker config simple.
Create ```/var/data/traefik/```, and then create ```traefik.toml``` inside it as follows:
Create `/var/data/traefik/traefik.toml` as follows:
```
checkNewVersion = true
@@ -81,14 +90,43 @@ watch = true
swarmmode = true
```
### Prepare the docker service config
!!! tip
"We'll want an overlay network, independent of our traefik stack, so that we can attach/detach all our other stacks (including traefik) to the overlay network. This way, we can undeploy/redepoly the traefik stack without having to bring every other stack first!" - voice of experience
Create `/var/data/config/traefik/traefik.yml` as follows:
```
version: "3.2"
# What is this?
#
# This stack exists solely to deploy the traefik_public overlay network, so that
# other stacks (including traefik-app) can attach to it
services:
scratch:
image: scratch
deploy:
replicas: 0
networks:
- public
networks:
public:
driver: overlay
attachable: true
ipam:
config:
- subnet: 172.16.200.0/24
```
!!! tip
I share (_with my [patreon patrons](https://www.patreon.com/funkypenguin)_) a private "_premix_" git repository, which includes necessary docker-compose and env files for all published recipes. This means that patrons can launch any recipe with just a ```git pull``` and a ```docker stack deploy``` 👍
Create /var/data/config/traefik/docker-compose.yml as follows:
Create `/var/data/config/traefik/traefik-app.yml` as follows:
```
version: "3"
@@ -97,6 +135,10 @@ services:
traefik:
image: traefik
command: --web --docker --docker.swarmmode --docker.watch --docker.domain=example.com --logLevel=DEBUG
# Note below that we use host mode to avoid source nat being applied to our ingress HTTP/HTTPS sessions
# Without host mode, all inbound sessions would have the source IP of the swarm nodes, rather than the
# original source IP, which would impact logging. If you don't care about this, you can expose ports the
# "minimal" way instead
ports:
- target: 80
published: 80
@@ -111,60 +153,92 @@ services:
protocol: tcp
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- /var/data/traefik/traefik.toml:/traefik.toml:ro
- /var/data/config/traefik:/etc/traefik
- /var/data/traefik/traefik.log:/traefik.log
- /var/data/traefik/acme.json:/acme.json
labels:
- "traefik.enable=false"
networks:
- public
# Global mode makes an instance of traefik listen on _every_ node, so that regardless of which
# node the request arrives on, it'll be forwarded to the correct backend service.
deploy:
labels:
- "traefik.enable=false"
mode: global
placement:
constraints: [node.role == manager]
restart_policy:
condition: on-failure
condition: on-failure
networks:
public:
driver: overlay
ipam:
driver: default
config:
- subnet: 10.1.0.0/24
traefik_public:
external: true
```
Docker won't start an image with a bind-mount to a non-existent file, so prepare an empty acme.json (_with the appropriate permissions_) by running:
Docker won't start a service with a bind-mount to a non-existent file, so prepare an empty acme.json (_with the appropriate permissions_) by running:
```
touch /var/data/traefik/acme.json
chmod 600 /var/data/traefik/acme.json
```
!!! warning
Pay attention above. You **must** set `acme.json`'s permissions to owner-readable-only, else the container will fail to start with an [ID-10T](https://en.wikipedia.org/wiki/User_error#ID-10-T_error) error!
Traefik will populate acme.json itself when it runs, but it needs to exist before the container will start (_Chicken, meet egg._)
### Launch
Deploy traefik with ```docker stack deploy traefik -c /var/data/traefik/docker-compose.yml```
Confirm traefik is running with ```docker stack ps traefik```
## Serving
You now have:
### Launch
1. Frontend proxy which will dynamically configure itself for new backend containers
2. Automatic SSL support for all proxied resources
First, launch the traefik stack (*which does nothing other than create the overlay network*) by running `docker stack deploy traefik -c /var/data/config/traefik/traefik.yml`:
```
[root@kvm ~]# docker stack deploy traefik -c traefik.yml
Creating network traefik_public
Creating service traefik_scratch
[root@kvm ~]#
```
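Optionally, confirm the overlay network now exists and is attachable before moving on:
```
# should print "overlay attachable=true"
docker network inspect traefik_public --format '{{.Driver}} attachable={{.Attachable}}'
```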
Now deploy the traefik application itself (*which will attach to the overlay network*) by running `docker stack deploy traefik-app -c /var/data/config/traefik/traefik-app.yml`:
```
[root@kvm ~]# docker stack deploy traefik-app -c traefik-app.yml
Creating service traefik-app_app
[root@kvm ~]#
```
Confirm traefik is running with `docker stack ps traefik-app`:
```
[root@kvm ~]# docker stack ps traefik-app
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
74uipz4sgasm traefik-app_app.t4vcm8siwc9s1xj4c2o4orhtx traefik:alpine kvm.funkypenguin.co.nz Running Running 33 seconds ago *:443->443/tcp,*:80->80/tcp
[root@kvm ~]#
```
### Check Traefik Dashboard
You should now be able to access your traefik instance on http://<node IP\>:8080. It'll look a little lonely for now (*below*), but we'll populate it as we add recipes :)
![Screenshot of Traefik, post-launch](/images/traefik-post-launch.png)
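If you prefer the CLI, traefik v1 also answers on a small JSON health/stats endpoint on the same port (*node IP hypothetical*):
```
curl -s http://192.168.31.11:8080/health
```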
### Summary
!!! summary
We've achieved:
* [X] An overlay network to permit traefik to access all future stacks we deploy
* [X] Frontend proxy which will dynamically configure itself for new backend containers
* [X] Automatic SSL support for all proxied resources
## Chef's Notes
Additional features I'd like to see in this recipe are:
1. Did you notice how no authentication was required to view the Traefik dashboard? Eek! We'll tackle that in the next section, regarding [Traefik Forward Authentication](/ha-docker-swarm/traefik-forward-auth/)!
1. Include documentation of oauth2_proxy container for protecting individual backends
2. Traefik webUI is available via HTTPS, protected with oauth_proxy
3. Pending a feature in docker-swarm to avoid NAT on routing-mesh-delivered traffic, update the design
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏

View File

@@ -1,21 +1,17 @@
# Virtual Machines
Let's start building our cloud with virtual machines. You could use bare-metal machines as well, the configuration would be the same. Given that most readers (myself included) will be using virtual infrastructure, from now on I'll be referring strictly to VMs.
I chose the "[Atomic](https://www.projectatomic.io/)" CentOS/Fedora image for the VM layer because:
1. I want less responsibility for maintaining the system, including ensuring regular software updates and reboots. Atomic's idempotent nature means the OS is largely read-only, and updates/rollbacks are "atomic" (haha) procedures, which can be easily rolled back if required.
2. For someone used to administrating servers individually, Atomic is a PITA. You have to employ [tricky](https://spinningmatt.wordpress.com/2014/01/08/a-recipe-for-starting-cloud-images-with-virt-install/) [tricks](http://blog.oddbit.com/2015/03/10/booting-cloud-images-with-libvirt/) to get it to install in a non-cloud environment. It's not designed for tweaking or customizing beyond what cloud-config is capable of. For my purposes, this is good, because it forces me to change my thinking - to consider every daemon as a container, and every config as code, to be checked in and version-controlled. Atomic forces this thinking on you.
3. I want the design to be as "portable" as possible. While I run it on VPSs now, I may want to migrate it to a "cloud" provider in the future, and I'll want the most portable, reproducible design.
Let's start building our cluster. You can use either bare-metal machines or virtual machines - the configuration would be the same. Given that most readers (myself included) will be using virtual infrastructure, from now on I'll be referring strictly to VMs.
!!! note
In 2017, I **initially** chose the "[Atomic](https://www.projectatomic.io/)" CentOS/Fedora image for the swarm hosts, but later found its outdated version of Docker to be problematic with advanced features like GPU transcoding (in [Plex](/recipes/plex/)), [Swarmprom](/recipes/swarmprom/), etc. In the end, I went mainstream and simply preferred a modern Ubuntu installation.
## Ingredients
!!! summary "Ingredients"
3 x Virtual Machines, each with:
* [ ] CentOS/Fedora Atomic
* [ ] At least 1GB RAM
* [ ] A mainstream Linux OS (*tested on either [CentOS](https://www.centos.org) 7+ or [Ubuntu](http://releases.ubuntu.com) 16.04+*)
* [ ] At least 2GB RAM
* [ ] At least 20GB disk space (_but it'll be tight_)
* [ ] Connectivity to each other within the same subnet, and on a low-latency link (_i.e., no WAN links_)
@@ -30,25 +26,13 @@ I chose the "[Atomic](https://www.projectatomic.io/)" CentOS/Fedora image for th
!!! tip
If you're not using a platform with cloud-init support (i.e., you're building a VM manually, not provisioning it through a cloud provider), you'll need to refer to [trick #1](https://spinningmatt.wordpress.com/2014/01/08/a-recipe-for-starting-cloud-images-with-virt-install/) and [trick #2](http://blog.oddbit.com/2015/03/10/booting-cloud-images-with-libvirt/) for a means to override the automated setup, apply a manual password to the CentOS account, and enable SSH password logins.
### Permit connectivity between hosts
### Prefer docker-latest
Most modern Linux distributions include firewall rules which only permit minimal required incoming connections (like SSH). We'll want to allow all traffic between our nodes. The steps to achieve this in CentOS/Ubuntu are a little different...
Run the following on each node to replace the default docker 1.12 with docker 1.13 (_which we need for swarm mode_):
```
systemctl disable docker --now
systemctl enable docker-latest --now
sed -i '/DOCKERBINARY/s/^#//g' /etc/sysconfig/docker
```
#### CentOS
### Upgrade Atomic
Finally, apply any Atomic host updates, and reboot, by running: ```atomic host upgrade && systemctl reboot```.
### Permit connectivity between VMs
By default, Atomic only permits incoming SSH. We'll want to allow all traffic between our nodes, so add something like this to /etc/sysconfig/iptables:
Add something like this to `/etc/sysconfig/iptables`:
```
# Allow all inter-node communication
@@ -57,6 +41,17 @@ By default, Atomic only permits incoming SSH. We'll want to allow all traffic be
And restart iptables with ```systemctl restart iptables```
#### Ubuntu
Install the (*non-default*) persistent iptables tools by running `apt-get install iptables-persistent`, establish some default rules (*dpkg will prompt you to save the current ruleset*), and then add something like this to `/etc/iptables/rules.v4`:
```
# Allow all inter-node communication
-A INPUT -s 192.168.31.0/24 -j ACCEPT
```
And refresh your running iptables rules with `iptables-restore < /etc/iptables/rules.v4`
### Enable host resolution
Depending on your hosting environment, you may have DNS automatically set up for your VMs. If not, it's useful to set up static entries in /etc/hosts for the nodes. For example, I set up the following:
@@ -80,13 +75,12 @@ ln -sf /usr/share/zoneinfo/<your timezone> /etc/localtime
After completing the above, you should have:
```
[X] 3 x fresh atomic instances, at the latest releases,
running Docker v1.13 (docker-latest)
[X] 3 x fresh linux instances, ready to become swarm nodes
```
## Chef's Notes
### Tip your waiter (donate) 👏
### Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (_..and support development of current and future recipes!_) See the [support](/support/) page for (_free or paid_) ways to say thank you! 👏