Kubernetes Home Lab (part 2)

Bootstrap the cluster with kubeadm:

Following on from part one, we will create our new Kubernetes cluster.

Our Kubernetes Cluster Topology:

A single master/control plane node and two worker nodes:

Deployment Steps

To deploy the cluster, we first take care of the prerequisites, then install Docker and kubeadm, following the Kubernetes documentation:

Order of deployment steps

Installing kubeadm, kubelet and kubectl

We will install these packages on all the machines:

  • kubeadm: the command to bootstrap the cluster.
  • kubelet: the component that runs on all of the machines in your cluster and does things like starting pods and containers.
  • kubectl: the command-line utility to talk to your cluster.

To install specific versions, pin them explicitly, for example:

sudo apt-get install -y kubelet=1.21.0-00 kubeadm=1.21.0-00 kubectl=1.21.0-00

Initialize / Create the Cluster:

We can then create our cluster with ‘kubeadm init’.

Bootstrap – Attempt 1

My first attempt failed the pre-flight checks due to a cgroups memory issue (the memory cgroup was not enabled):

root@k8s-master01:~# kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.0.185
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
[preflight] The system verification failed. Printing the output from the verification:
root@k8s-master01:~# docker info | head
Client:
Context:    default
Debug Mode: false
Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
Containers: 2
  Running: 0
root@k8s-master01:~# docker info | grep -i cgroup
Cgroup Driver: systemd
Cgroup Version: 1
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory TCP limit support
WARNING: No oom kill disable support

cgroups Fix

The fix is described here https://phabricator.wikimedia.org/T122734.

As there is no GRUB with Ubuntu on Raspberry Pi (https://unix.stackexchange.com/questions/475973/cant-find-etc-default-grub), I simply had to append cgroup_enable=memory swapaccount=1 to the kernel command line in /boot/firmware/cmdline.txt and reboot:

root@k8s-master01:~# cat /boot/firmware/cmdline.txt
net.ifnames=0 dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc cgroup_enable=memory swapaccount=1
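
The same edit can be scripted. This is a minimal sketch, assuming GNU grep and sed as shipped with Ubuntu. The helper name add_cmdline_params is hypothetical. It appends only the parameters that are missing and keeps the file on a single line:

```shell
# Hypothetical helper: append kernel parameters to a single-line cmdline file
# only when they are missing, so it is safe to run more than once.
add_cmdline_params() {
  file="$1"
  shift
  for param in "$@"; do
    # -F: literal match, -w: whole word (swapaccount=1 does not match swapaccount=10)
    if ! grep -qwF -- "$param" "$file"; then
      # Append to the end of line 1; cmdline.txt must stay a single line
      sed -i "1 s|\$| $param|" "$file"
    fi
  done
}

# Usage (as root), followed by a reboot:
#   cp /boot/firmware/cmdline.txt /boot/firmware/cmdline.txt.bak
#   add_cmdline_params /boot/firmware/cmdline.txt cgroup_enable=memory swapaccount=1
```

Taking a backup copy first is cheap insurance, since a broken cmdline.txt can leave the Pi unable to boot.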

After reboot:

istacey@k8s-master01:~$ uptime

20:54:27 up 1 min,  1 user,  load average: 1.36, 0.52, 0.19
istacey@k8s-master01:~$ cat /proc/cmdline
coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 bcm2708_fb.fbwidth=0 bcm2708_fb.fbheight=0 bcm2708_fb.fbswap=1 smsc95xx.macaddr=DC:A6:32:02:F0:6E vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  net.ifnames=0 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc cgroup_enable=memory swapaccount=1 quiet splash

Bootstrap – Attempt 2

Second attempt is successful:

root@k8s-master01:~# kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.0.185
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: missing optional cgroups: hugetlb
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.185]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.0.185 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.0.185 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 28.511339 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: oq7hb9.vtmiw210ozvi2grh
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.185:6443 --token oq7hb9.vtmiw210ozvi2grh \
        --discovery-token-ca-cert-hash sha256:c87681fc7fec18f015f974e558d8436113019fefbf91123bb5c5190466b5854d
root@k8s-master01:~#

Install the pod network add-on (CNI)

We will use Weave Net for this cluster:

https://kubernetes.io/docs/concepts/cluster-administration/networking/#how-to-implement-the-kubernetes-networking-model

https://www.weave.works/docs/net/latest/kubernetes/
https://www.weave.works/docs/net/latest/kubernetes/kube-addon/

istacey@k8s-master01:~$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created
istacey@k8s-master01:~$

Check Nodes are ready and pods running:

istacey@k8s-master01:~$ kubectl get nodes
NAME           STATUS   ROLES                  AGE   VERSION
k8s-master01   Ready    control-plane,master   13m   v1.22.2

istacey@k8s-master01:~$ kubectl get pods -A
NAMESPACE     NAME                                   READY   STATUS    RESTARTS      AGE
kube-system   coredns-78fcd69978-cnql6               1/1     Running   0             14m
kube-system   coredns-78fcd69978-k4bnk               1/1     Running   0             14m
kube-system   etcd-k8s-master01                      1/1     Running   0             14m
kube-system   kube-apiserver-k8s-master01            1/1     Running   0             14m
kube-system   kube-controller-manager-k8s-master01   1/1     Running   0             14m
kube-system   kube-proxy-lx8bj                       1/1     Running   0             14m
kube-system   kube-scheduler-k8s-master01            1/1     Running   0             14m
kube-system   weave-net-f7f7h                        2/2     Running   1 (99s ago)   2m2s

Join the two Worker Nodes:

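To print a join command with a fresh token, run the following on the control-plane node, then run the printed kubeadm join command as root on each worker: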

kubeadm token create --help
kubeadm token create --print-join-command

Check the cluster once the two nodes have joined:

istacey@k8s-master01:~$ kubectl get nodes
NAME           STATUS   ROLES                  AGE    VERSION
k8s-master01   Ready    control-plane,master   18m    v1.22.2
k8s-worker01   Ready    <none>                 2m4s   v1.22.2
k8s-worker02   Ready    <none>                 51s    v1.22.2

istacey@k8s-master01:~$ kubectl get ds -n kube-system
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-proxy   3         3         3       3            3           kubernetes.io/os=linux   18m
weave-net    3         3         3       3            3           <none>                   6m10s
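
Rather than re-reading the kubectl get nodes output by eye, the readiness check can be scripted. This is a small sketch. The helper name not_ready_nodes is hypothetical. It reads `kubectl get nodes --no-headers` output on stdin and prints every node whose STATUS column is not exactly Ready:

```shell
# Hypothetical helper: print the names of nodes that are not plainly "Ready".
# Input: the output of `kubectl get nodes --no-headers` on stdin.
# A cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage:
#   kubectl get nodes --no-headers | not_ready_nodes
#
# kubectl also offers a blocking check:
#   kubectl wait --for=condition=Ready nodes --all --timeout=300s
```

An empty result means every node reports Ready.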

Quick Test:

istacey@k8s-master01:~$ kubectl run nginx --image=nginx
pod/nginx created
istacey@k8s-master01:~$ kubectl get pods -o wide
NAME    READY   STATUS              RESTARTS   AGE   IP       NODE           NOMINATED NODE   READINESS GATES
nginx   0/1     ContainerCreating   0          23s   <none>   k8s-worker01   <none>           <none>
istacey@k8s-master01:~$ kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          27s   10.44.0.1   k8s-worker01   <none>           <none>

istacey@k8s-master01:~$ kubectl delete po nginx
pod "nginx" deleted

Part 1 here:


Extend Ceph Storage for Kubernetes Cluster

Scenario:

Four worker nodes, each with a 25 GB raw disk, back a Ceph block cluster managed by Rook. As we are running low on space, we will extend each raw disk to 50 GB and update rook-ceph accordingly.

Ceph OSD Management

Ceph Object Storage Daemons (OSDs) are the heart and soul of the Ceph storage platform. Each OSD manages a local device and together they provide the distributed storage. Rook will automate creation and management of OSDs to hide the complexity based on the desired state in the CephCluster CR as much as possible. This guide will walk through some of the scenarios to configure OSDs where more configuration may be required.

OSD Health

The rook-ceph-tools pod provides a simple environment to run Ceph tools. The ceph commands mentioned in this document should be run from the toolbox.

Once the toolbox pod is created, connect to it to run the ceph commands that analyze the health of the cluster, in particular the OSDs and placement groups (PGs). Some common commands to analyze OSDs include:

ceph status
ceph osd tree
ceph osd status
ceph osd df
ceph osd utilization

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash

Status Before:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 13h)
    mgr: a(active, since 13h)
    osd: 4 osds: 4 up (since 13h), 4 in (since 13h)

  data:
    pools:   2 pools, 129 pgs
    objects: 5.12k objects, 19 GiB
    usage:   63 GiB used, 37 GiB / 100 GiB avail
    pgs:     129 active+clean

  io:
    client:   60 KiB/s wr, 0 op/s rd, 1 op/s wr


[istacey@master001 ~]$ kubectl --kubeconfig=/home/istacey/.kube/config-hr -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r -- ceph osd  status
ID  HOST        USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  worker002  10.9G  14.0G      1      151k      0        0   exists,up
 1  worker003  10.9G  14.0G      0        0       0        0   exists,up
 2  worker004  11.1G  13.8G      0        0       0        0   exists,up
 3  worker001  9.98G  15.0G      0        0       0        0   exists,up


openet@worker001:~$ lsblk | grep sdb -A1
sdb                                                                                                     8:16   0   25G  0 disk
└─ceph--f067bb6e--522a--48c6--a2a8--8930d15dc02f-osd--block--dc871464--0a16--484a--8fa8--b723eec178f1 253:10   0   25G  0 lvm

Raw Disk Extended:

openet@worker001:~$ lsblk | grep sdb -A2
sdb                                                                                                     8:16   0   50G  0 disk
└─ceph--f067bb6e--522a--48c6--a2a8--8930d15dc02f-osd--block--dc871464--0a16--484a--8fa8--b723eec178f1 253:10   0   25G  0 lvm

Remove the OSDs (one at a time):

https://github.com/rook/rook/blob/master/Documentation/ceph-osd-mgmt.md#remove-an-osd

To remove an OSD due to a failed disk or other re-configuration, consider the following to ensure the health of the data through the removal process:

  • Confirm you will have enough space on your cluster after removing your OSDs to properly handle the deletion
  • Confirm the remaining OSDs and their placement groups (PGs) are healthy in order to handle the rebalancing of the data
  • Do not remove too many OSDs at once
  • Wait for rebalancing between removing multiple OSDs

If all the PGs are active+clean and there are no warnings about being low on space, this means the data is fully replicated and it is safe to proceed. If an OSD is failing, the PGs will not be perfectly clean and you will need to proceed anyway.
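
The "all PGs are active+clean" condition can be checked mechanically before each removal. This is a rough sketch. The helper name pgs_all_clean is hypothetical, and it assumes the plain-text `ceph status` layout shown in this post. It compares the total PG count on the pools line with the count on the active+clean line:

```shell
# Hypothetical helper: succeed only if every PG is active+clean.
# Input: plain-text `ceph status` output on stdin.
pgs_all_clean() {
  awk '
    /pools:/              { total = $(NF-1) }   # e.g. "pools:   2 pools, 129 pgs"
    $NF == "active+clean" { clean = $(NF-1) }   # e.g. "pgs:     129 active+clean"
    END { exit !(total > 0 && clean == total) }
  '
}

# Usage (deploy/rook-ceph-tools is an assumed name for the toolbox deployment):
#   kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status | pgs_all_clean \
#     && echo "all PGs active+clean"
```

This covers only the PG state. Warnings about low space still have to be read from the health section.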

Scale down the rook-ceph-operator first, so that it does not recreate the OSD deployment, then scale down the OSD deployment itself:

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep opera
rook-ceph-operator                   1/1     1            1           77d

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
deployment.apps/rook-ceph-operator scaled

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep opera
rook-ceph-operator                   0/0     0            0           77d

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status | egrep 'health|osds|usage'
    health: HEALTH_OK
    osd: 4 osds: 4 up (since 13h), 4 in (since 13h)
    usage:   63 GiB used, 37 GiB / 100 GiB avail

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep osd
rook-ceph-osd-0                      1/1     1            1           38h
rook-ceph-osd-1                      1/1     1            1           77d
rook-ceph-osd-2                      1/1     1            1           77d
rook-ceph-osd-3                      1/1     1            1           77d

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=0
deployment.apps/rook-ceph-osd-0 scaled

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep osd
rook-ceph-osd-0                      0/0     0            0           38h
rook-ceph-osd-1                      1/1     1            1           77d
rook-ceph-osd-2                      1/1     1            1           77d
rook-ceph-osd-3                      1/1     1            1           77d

Mark the OSD down and out

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd down osd.0
osd.0 is already down.

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status | egrep 'health|osds|usage'
    health: HEALTH_WARN
            1 osds down
            1 host (1 osds) down
    osd: 4 osds: 3 up (since 101s), 4 in (since 13h); 1 remapped pgs
    usage:   63 GiB used, 37 GiB / 100 GiB avail

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.09760  root default
-9         0.02440      host worker001
 3    hdd  0.02440          osd.3           up   1.00000  1.00000
-3         0.02440      host worker002
 0    hdd  0.02440          osd.0         down   1.00000  1.00000
-5         0.02440      host worker003
 1    hdd  0.02440          osd.1           up   1.00000  1.00000
-7         0.02440      host worker004
 2    hdd  0.02440          osd.2           up   1.00000  1.00000

### Mark the OSD as out:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd out osd.0
marked out osd.0.

Wait for the data to finish backfilling to other OSDs.

ceph status will indicate the backfilling is done when all of the PGs are active+clean.

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_WARN
            Degraded data redundancy: 3171/15372 objects degraded (20.628%), 80 pgs degraded, 80 pgs undersized

  services:
    mon: 3 daemons, quorum a,b,c (age 13h)
    mgr: a(active, since 13h)
    osd: 4 osds: 3 up (since 4m), 3 in (since 96s); 80 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 5.12k objects, 19 GiB
    usage:   50 GiB used, 25 GiB / 75 GiB avail
    pgs:     3171/15372 objects degraded (20.628%)
             78 active+undersized+degraded+remapped+backfill_wait
             49 active+clean
             2  active+undersized+degraded+remapped+backfilling

  io:
    client:   71 KiB/s wr, 0 op/s rd, 1 op/s wr
    recovery: 6.5 MiB/s, 1 objects/s

### backfilling is done:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status | egrep 'health|osds|usage'
    health: HEALTH_OK
    osd: 4 osds: 3 up (since 22m), 3 in (since 19m)
    usage:   62 GiB used, 13 GiB / 75 GiB avail
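
Waiting for the backfill can be automated with a simple polling loop. This is a sketch, not a definitive implementation. The helper names wait_until and ceph_healthy are hypothetical, and deploy/rook-ceph-tools is an assumed name for the toolbox deployment. HEALTH_OK is a slightly stricter condition than "all PGs active+clean":

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# $1 = timeout in seconds, $2 = polling interval in seconds, rest = command.
wait_until() {
  timeout="$1"
  interval="$2"
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Assumed check: succeeds only when `ceph health` reports HEALTH_OK.
ceph_healthy() {
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph health | grep -q HEALTH_OK
}

# Usage: poll every 30 seconds for up to one hour
#   wait_until 3600 30 ceph_healthy && echo "backfill finished"
```

In the run above the backfill took roughly 20 minutes, so a one-hour timeout leaves some margin for a cluster of this size.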

Remove the OSD from the Ceph cluster

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd purge osd.0 --yes-i-really-mean-it
purged osd.0

### Note osd.0 is on worker002:

[istacey@master001 ~]$ kubectl get pods -n rook-ceph -o wide | grep osd | grep -v prepare
rook-ceph-osd-1-6c468554f4-8btvj                      1/1     Running     3          26h   10.42.171.207    worker003   <none>           <none>
rook-ceph-osd-2-5f8ffcd5bb-p44d4                      1/1     Running     1          25h   10.42.64.205     worker004   <none>           <none>
rook-ceph-osd-3-5d8b989cb-4hf8h                       1/1     Running     5          27h   10.42.7.26       worker001   <none>           <none>

Zap the disk

https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md#zapping-devices

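As root, clean and prepare the disk on the VM that hosted the removed OSD (worker002 here):

# WARNING: this destroys all data on $DISK
DISK="/dev/sdb"
# Zero the first 100 MiB of the disk to wipe the old LVM metadata
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Remove the stale ceph device-mapper entries and their /dev nodes
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
# Ask the kernel to re-read the partition table
partprobe "$DISK"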


[root@worker002 ~]# lsblk

NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb                   8:16   0   50G  0 disk

Scale back up and let the OSD rejoin:

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
deployment.apps/rook-ceph-operator scaled

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=1
deployment.apps/rook-ceph-osd-0 scaled

[istacey@master001 ~]$ kubectl -n rook-ceph get deployment | egrep 'rook-ceph-operator|rook-ceph-osd'
rook-ceph-operator                   1/1     1            1           77d
rook-ceph-osd-0                      1/1     1            1           39h
rook-ceph-osd-1                      1/1     1            1           77d
rook-ceph-osd-2                      1/1     1            1           77d
rook-ceph-osd-3                      1/1     1            1           77d

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.12199  root default
-9         0.02440      host worker001
 3    hdd  0.02440          osd.3           up   1.00000  1.00000
-3         0.04880      host worker002
 0    hdd  0.04880          osd.0           up   1.00000  1.00000
-5         0.02440      host worker003
 1    hdd  0.02440          osd.1           up   1.00000  1.00000
-7         0.02440      host worker004
 2    hdd  0.02440          osd.2           up   1.00000  1.00000

openet@worker002:~$ lsblk | grep sdb -A1
sdb                                                                                                     8:16   0   50G  0 disk
└─ceph--ea8115b7--5418--41b9--b4d3--d6e22526dbb1-osd--block--68cfcb49--f858--46f2--979f--dc266e4e6cf0 253:10   0   50G  0 lvm

Wait for the rebalance to finish.

Once rebalancing is done:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 14h)
    mgr: a(active, since 14h)
    osd: 4 osds: 4 up (since 32m), 4 in (since 32m)

  task status:

  data:
    pools:   2 pools, 129 pgs
    objects: 5.12k objects, 19 GiB
    usage:   63 GiB used, 62 GiB / 125 GiB avail
    pgs:     129 active+clean

  io:
    client:   73 KiB/s wr, 0 op/s rd, 2 op/s wr

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r -- ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL   USED    RAW USED  %RAW USED
hdd    125 GiB  62 GiB  59 GiB    63 GiB      50.14
TOTAL  125 GiB  62 GiB  59 GiB    63 GiB      50.14

--- POOLS ---
POOL                   ID  PGS  STORED  OBJECTS  USED    %USED  MAX AVAIL
device_health_metrics   1    1     0 B        0     0 B      0     15 GiB
replicapool             3  128  19 GiB    5.12k  58 GiB  55.89     15 GiB

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd status
ID  HOST        USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  worker002  20.5G  29.4G      1      156k      0        0   exists,up
 1  worker003  13.2G  11.7G      0        0       0        0   exists,up
 2  worker004  14.3G  10.6G      0        0       0        0   exists,up
 3  worker001  14.5G  10.4G      0     4095       0        0   exists,up

Repeat for the remaining three OSDs.

The operator ideally will automatically create the new OSD within a few minutes of adding the new device or updating the CR. If you don’t see a new OSD automatically created, restart the operator (by deleting the operator pod) to trigger the OSD creation.

Extra step after hitting an issue:

After the scaling operations, an OSD pod was in an error state and the storage on one node was not available. This was resolved by editing the CephCluster resource with kubectl:

### Edit with kubectl and remove node:

kubectl edit CephCluster rook-ceph -n rook-ceph 

    - deviceFilter: sdb
      name: worker001
      resources: {}

End result:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 3h)
    mgr: a(active, since 22h)
    osd: 4 osds: 4 up (since 94m), 4 in (since 94m)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 5.12k objects, 19 GiB
    usage:   63 GiB used, 137 GiB / 200 GiB avail
    pgs:     33 active+clean
 
  io:
    client:   49 KiB/s wr, 0 op/s rd, 1 op/s wr
 
[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd status
ID  HOST        USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  worker002  12.7G  37.2G      0        0       0        0   exists,up
 1  worker003  16.2G  33.7G      1     24.7k      0        0   exists,up
 2  worker004  15.2G  34.7G      0        0       0        0   exists,up
 3  worker001  18.6G  31.3G      0      819       0        0   exists,up

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.09760  root default
-9         0.02440      host worker001
 3    hdd  0.02440          osd.3           up   1.00000  1.00000
-3         0.02440      host worker002
 0    hdd  0.02440          osd.0           up   1.00000  1.00000
-5         0.02440      host worker003
 1    hdd  0.02440          osd.1           up   1.00000  1.00000
-7         0.02440      host worker004
 2    hdd  0.02440          osd.2           up   1.00000  1.00000

References:

https://github.com/rook/rook/blob/master/Documentation/ceph-osd-mgmt.md#remove-an-osd

https://github.com/rook/rook/issues/2997

https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/

https://www.cloudops.com/blog/the-ultimate-rook-and-ceph-survival-guide/


Basic HA NFS Server with Keepalived

Aim

Create a simple NFS HA cluster on RHEL 7 VMs with local storage, as shown below. The VMs run as guests on a RHEL 8 server running KVM. Connections will be made from the local network, and pods running in a Kubernetes cluster will mount the share as PersistentVolumes.

I will also create an SFTP chroot jail for incoming client SFTP connections.

Logical diagram

Prerequisites

Install and enable the nfs, keepalived and rsync packages:

sudo yum install -y nfs-utils keepalived rsync

sudo systemctl enable nfs-server
sudo systemctl enable keepalived

keepalived --version

Get IP info:

[istacey@nfs-server01 ~]$ ip --brief a s
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.12.6.111/25 fe80::5054:ff:fe79:79b3/64
eth1             UP             192.168.112.111/24 fe80::5054:ff:fe06:8dc5/64
eth2             UP             10.12.8.103/28 fe80::5054:ff:fec6:428f/64

[istacey@nfs-server02 ~]$ ip --brief a s
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.12.6.112/25 fe80::5054:ff:fef5:765e/64
eth1             UP             192.168.112.112/24 fe80::5054:ff:fead:fa64/64
eth2             UP             10.12.8.104/28 fe80::5054:ff:fef5:13de/64

VIP details:

Role       Hostname      IP
VIP (NFS)  nfsvip        10.12.8.102
NFS_01     nfs-server01  10.12.8.103
NFS_02     nfs-server02  10.12.8.104

Configure keepalived

Server 1

[istacey@nfs-server01 ~]$ cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

vrrp_instance VI_1 {
    state MASTER
    interface eth2
    virtual_router_id 51
    priority 255
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.12.8.102
    }
}

Server 2:

[istacey@nfs-server02 ~]$ cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

vrrp_instance VI_1 {
    state BACKUP
    interface eth2
    virtual_router_id 51
    priority 254
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.12.8.102
    }
}

Test with ping. Before keepalived is started on either server, the VIP does not respond:


[istacey@nfs-server02 ~]$ ping 10.12.8.102
PING 10.12.8.102 (10.12.8.102) 56(84) bytes of data.
From 10.12.8.104 icmp_seq=1 Destination Host Unreachable
From 10.12.8.104 icmp_seq=2 Destination Host Unreachable
From 10.12.8.104 icmp_seq=3 Destination Host Unreachable
From 10.12.8.104 icmp_seq=4 Destination Host Unreachable
^C
--- 10.12.8.102 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms
pipe 4
[istacey@nfs-server02 ~]$


Start keepalived on both servers and ping again:

[istacey@nfs-server01 ~]$ sudo systemctl start  keepalived
[istacey@nfs-server02 ~]$ sudo systemctl start  keepalived

[istacey@nfs-server02 ~]$ ping 10.12.8.102
PING 10.12.8.102 (10.12.8.102) 56(84) bytes of data.
64 bytes from 10.12.8.102: icmp_seq=1 ttl=64 time=0.123 ms
64 bytes from 10.12.8.102: icmp_seq=2 ttl=64 time=0.116 ms
64 bytes from 10.12.8.102: icmp_seq=3 ttl=64 time=0.104 ms

Show VIP:

[istacey@nfs-server01 ~]$ ip --brief a s
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.12.6.111/25 fe80::5054:ff:fe79:79b3/64
eth1             UP             192.168.112.111/24 fe80::5054:ff:fe06:8dc5/64
eth2             UP             10.12.8.103/28 10.12.8.102/32 fe80::5054:ff:fec6:428f/64
[istacey@nfs-server01 ~]$
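
The same check can be scripted to see which server currently holds the VIP, for example after stopping keepalived on the master. This is a small sketch. The helper name holds_vip is hypothetical. It reads `ip --brief a s` output on stdin and compares each address exactly, ignoring the prefix length:

```shell
# Hypothetical helper: succeed if the given VIP is configured on any interface.
# Input: the output of `ip --brief a s` on stdin.
holds_vip() {
  vip="$1"
  awk -v vip="$vip" '
    {
      # Columns 3 and later are addresses in addr/prefix form
      for (i = 3; i <= NF; i++) {
        split($i, parts, "/")
        if (parts[1] == vip) found = 1
      }
    }
    END { exit !found }
  '
}

# Usage:
#   ip --brief a s | holds_vip 10.12.8.102 && echo "this server holds the VIP"
```

With the configuration above, the check should succeed on nfs-server01 while it is the MASTER, and on nfs-server02 after a failover.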

Create SFTP chroot jail

Create the groups (including sftpusers) and the users on both servers:

[istacey@nfs-server01 ~]$ sudo groupadd -g 15000 nfsrsync
[istacey@nfs-server01 ~]$ sudo groupadd -g 15001 vmnfs1
[istacey@nfs-server01 ~]$ sudo groupadd -g 15002 sftpusers

[istacey@nfs-server01 ~]$ sudo useradd -u 15000 -g nfsrsync nfsrsync
[istacey@nfs-server01 ~]$ sudo useradd -u 15001 -g vmnfs1 vmnfs1

[istacey@nfs-server01 ~]$ sudo usermod -aG sftpusers,nfsrsync vmnfs1

[istacey@nfs-server01 ~]$ sudo mkdir /NFS/vmnfs1
[istacey@nfs-server01 ~]$ sudo mkdir /NFS/vmnfs1/home
[istacey@nfs-server01 ~]$ sudo mkdir /NFS/vmnfs1/home/voucher-management
[istacey@nfs-server01 ~]$ sudo chown vmnfs1:sftpusers /NFS/vmnfs1/home

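Note: change ownership only for the user’s chrooted “home” directory. It’s important to leave everything else with the default root permissions: sshd requires every directory in the ChrootDirectory path to be owned by root and not writable by any other user or group, otherwise it refuses the connection.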

[istacey@nfs-server01 ~]$ find /NFS -type d -exec ls -ld {} \;
drwxr-xr-x. 3 root root 20 Jul 20 15:52 /NFS
drwxr-xr-x 3 root root 18 Jul 20 15:00 /NFS/vmnfs1
drwxr-xr-x 3 vmnfs1 sftpusers 52 Jul 20 15:16 /NFS/vmnfs1/home
drwxrwxrwx 2 vmnfs1 nfsrsync 59 Jul 20 15:31 /NFS/vmnfs1/home/voucher-management
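
The ownership rule for the chroot path can be checked before touching sshd_config. This is a sketch, assuming GNU stat as found on RHEL. The helper name check_chroot_path is hypothetical. It walks from the given directory up to / and reports the first component that is not root-owned or is group- or world-writable:

```shell
# Hypothetical helper: verify that a chroot directory and all of its parents
# are owned by root and not writable by group or others (GNU stat assumed).
check_chroot_path() {
  dir="$1"
  case "$dir" in
    /*) ;;
    *) echo "need an absolute path: $dir" >&2; return 2 ;;
  esac
  while :; do
    owner=$(stat -c '%U' "$dir") || return 1
    mode=$(stat -c '%a' "$dir") || return 1
    if [ "$owner" != "root" ]; then
      echo "$dir is owned by $owner, not root" >&2
      return 1
    fi
    # Octal permission bits: 020 is group-write, 002 is other-write
    if [ $(( 0$mode & 022 )) -ne 0 ]; then
      echo "$dir is group- or world-writable (mode $mode)" >&2
      return 1
    fi
    [ "$dir" = "/" ] && break
    dir=$(dirname "$dir")
  done
  return 0
}

# Usage:
#   check_chroot_path /NFS/vmnfs1 && echo "chroot path looks acceptable to sshd"
```

Only the chroot directory itself and its parents are checked. Directories below it, such as home, may belong to the user.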

Update the sshd configuration and restart the service:

[istacey@nfs-server01 ~]$ sudo vi /etc/ssh/sshd_config

[istacey@nfs-server01 ~]$ sudo cat  /etc/ssh/sshd_config | grep Subsys -A3
#Subsystem      sftp    /usr/libexec/openssh/sftp-server
Subsystem   sftp    internal-sftp -d /home
Match Group sftpusers
ChrootDirectory /NFS/%u
ForceCommand internal-sftp -d /home/voucher-management

[istacey@nfs-server01 ~]$ sudo  systemctl restart sshd

Note: the -d option on ForceCommand internal-sftp sets the start directory, so the sftp user is dropped into that subdirectory of the chroot.

To test, first try ssh; this should fail:

[istacey@nfs-server02 ~]$ ssh vmnfs1@nfs-server01
vmnfs1@nfs-server01's password:
Last login: Tue Jul 20 15:13:33 2021 from nfs-server02-om.ocs.a1.hr
/bin/bash: No such file or directory
Connection to nfs-server01 closed.
[istacey@nfs-server02 ~]$

Or:

[istacey@nfs-server02 ~]$ ssh vmnfs1@nfs-server01
vmnfs1@nfs-server01's password:
This service allows sftp connections only.
Connection to nfs-server01 closed.
[istacey@nfs-server02 ~]$

The user can no longer connect via ssh. Let’s try sftp:

[istacey@nfs-server02 ~]$ sftp  vmnfs1@nfs-server01
vmnfs1@nfs-server01's password:
Connected to nfs-server01.
sftp> pwd
Remote working directory: /home/voucher-management
sftp> ls
testfile       testfile1      testfiledate
sftp> quit
[istacey@nfs-server02 ~]$

As required, the user is dropped into /home/voucher-management (/NFS/vmnfs1/home/voucher-management/ on the server).

Finally, make sure a regular user can still log in via ssh without the chroot restrictions. With that, this part is done: the sftp server is configured with a jailed chroot user.

Configure rsync

As we are only using local storage and not shared storage, we will synchronize the folders with rsync.

On both servers I created a user account called nfsrsync, verified folder ownership and permissions, and generated and copied ssh keys.

[nfsrsync@nfs-server01 ~]$ ssh-keygen -t rsa
[nfsrsync@nfs-server01 .ssh]$ cp id_rsa.pub authorized_keys

[nfsrsync@nfs-server01 ~]$ ssh-copy-id nfs-server02
[nfsrsync@nfs-server01 .ssh]$ scp id_rsa* nfs-server02:~/.ssh/

Add a cron job on each server so that rsync runs in both directions, each side pushing to the other. I chose not to run rsync as a daemon for this solution.

[nfsrsync@nfs-server01 ~]$ crontab -l
*/5 * * * * rsync -rt /NFS/vmnfs1/home/voucher-management/ nfsrsync@nfs-server02:/NFS/vmnfs1/home/voucher-management/

[nfsrsync@nfs-server02 ~]$ crontab -l
*/5 * * * * rsync -rt /NFS/vmnfs1/home/voucher-management/ nfsrsync@nfs-server01:/NFS/vmnfs1/home/voucher-management/
[nfsrsync@nfs-server02 ~]$
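
Two things are worth knowing about these cron entries. With -rt and no --delete, deletions never propagate: a file removed on one server is pushed back from the other on the next run. And if a transfer ever takes longer than five minutes, the next run starts while the previous one is still going. The sketch below guards against the overlap with a lock directory. The helper name run_locked and the lock path are hypothetical and not part of the original setup:

```shell
#!/bin/sh
# Sketch: run a command only if no earlier run still holds the lock.
# run_locked and the lock path are hypothetical names for illustration.
run_locked() {
    lockdir=$1
    shift
    # mkdir is atomic: it fails if the directory already exists
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75                      # arbitrary non-zero "try again later" code
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}

# Example cron command line (same rsync as above, wrapped):
# run_locked /tmp/nfsrsync.lock rsync -rt /NFS/vmnfs1/home/voucher-management/ nfsrsync@nfs-server02:/NFS/vmnfs1/home/voucher-management/
```

A run that is killed part-way leaves the lock directory behind, so it would need removing by hand. Where flock from util-linux is available, it is a reasonable alternative that does not have this problem.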

Configure NFS

On both servers:

[istacey@nfs-server01 ~]$ sudo vi /etc/exports
[istacey@nfs-server01 ~]$ cat /etc/exports
/NFS/vmnfs1/home/voucher-management     *(rw,no_root_squash)
[istacey@nfs-server01 ~]$ sudo systemctl start nfs-server
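
The export above uses * as the host and sets no_root_squash, so any client that can reach the server may mount the share, and root on that client acts as root on the files. That may be fine on an isolated network, but it is worth making visible. A minimal sketch that flags such entries follows. The helper name audit_exports is hypothetical, and it assumes export paths without spaces:

```shell
#!/bin/sh
# Sketch: flag export entries open to every host ("*") or using no_root_squash.
# audit_exports is a hypothetical helper; it only reads the file.
audit_exports() {
    file=${1:-/etc/exports}
    awk '
        NF == 0 || $1 ~ /^#/ { next }          # skip blank lines and comments
        {
            # field 1 is the exported path, the rest are host(options) entries
            for (i = 2; i <= NF; i++) {
                if ($i == "*" || index($i, "*(") == 1)
                    print $1 ": exported to all hosts"
                if ($i ~ /no_root_squash/)
                    print $1 ": no_root_squash set"
            }
        }
    ' "$file"
}
```

Run against the /etc/exports shown above, it should report both conditions for the voucher-management share. If the share is only meant for known clients, the host field can name them instead of *. Whether no_root_squash is needed depends on the clients, which this post does not cover.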

Verify with showmount and test mounting the share, from server 2:

[istacey@nfs-server02 ~]$ sudo mount nfs-server01:/NFS/vmnfs1/home/voucher-management  /mnt
[istacey@nfs-server02 ~]$ df -h /mnt
Filesystem                                          Size  Used Avail Use% Mounted on
nfs-server01:/NFS/vmnfs1/home/voucher-management  100G   33M  100G   1% /mnt
[istacey@nfs-server02 ~]$ mount | grep nfs4
nfs-server01:/NFS/vmnfs1/home/voucher-management on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.12.6.112,local_lock=none,addr=10.12.6.111)
[istacey@nfs-server02 ~]$

[istacey@nfs-server02 ~]$ find /mnt
/mnt
/mnt/testfile
/mnt/testfile1
/mnt/testfiledate
[istacey@nfs-server02 ~]$

And we are done.

References

Keepalived: https://www.redhat.com/sysadmin/keepalived-basics

rsync: https://www.atlantic.net/vps-hosting/how-to-use-rsync-copy-sync-files-servers/

chroot jail: https://access.redhat.com/solutions/2399571

ForceCommand: https://serverfault.com/questions/704869/forward-sftp-user-to-chroot-subdirectory-after-authentication

Categories
Cloud Linux

Kubernetes Home Lab (part 1)

Infrastructure:

Ideally I wanted to run a home Kubernetes cluster on three or more Raspberry Pis, but at the time of writing I only have one suitable Pi 4 at home and stock appears to be in short supply. Instead I will use what I have, mixing and matching devices.

  • One HP Z200 Workstation with 8GB RAM, running Ubuntu 20.04 with KVM, hosting two Ubuntu VMs that I’ll designate as worker nodes in the cluster.
  • One Raspberry Pi 4 Model B with 2GB RAM, running Ubuntu 20.04, that I’ll use as the Kubernetes Master / Control Plane node.
My makeshift home lab with Stormtrooper on patrol!

Install and Prepare Ubuntu 20.04 on the Z200 / Configure the KVM Hypervisor:

Install Ubuntu on the Z200 Workstation via a bootable USB stick.

Install cpu-checker and verify that the system can use KVM acceleration.

sudo apt install cpu-checker
sudo kvm-ok
The workstation to be used as my hypervisor
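
If cpu-checker is not installed, the CPU flags can be inspected directly: vmx marks Intel VT-x and svm marks AMD-V. The sketch below counts the logical CPUs that advertise either flag. The helper name count_virt_cpus is hypothetical:

```shell
#!/bin/sh
# Sketch: count logical CPUs that advertise hardware virtualization.
# vmx = Intel VT-x, svm = AMD-V. count_virt_cpus is a hypothetical helper.
count_virt_cpus() {
    cpuinfo=${1:-/proc/cpuinfo}
    # grep -c prints the number of matching lines; "|| true" keeps a zero
    # count (grep exit status 1) from being treated as a failure
    grep -Ec '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "$cpuinfo" || true
}
```

A non-zero count only shows that the CPU advertises the feature. kvm-ok remains the fuller check, since it also looks at whether KVM is actually usable on the host.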

Install KVM Packages:

sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virt-manager virtinst

sudo systemctl status libvirtd

sudo systemctl enable --now libvirtd

Authorize User and Verify the install:

sudo usermod -aG libvirt $USER
sudo usermod -aG kvm $USER

sudo virsh list --all

Configure Bridged Networking:

Bridged networking allows the virtual interfaces to connect to the outside network through the physical interface, making them appear as normal hosts to the rest of the network. https://help.ubuntu.com/community/KVM/Networking#Bridged_Networking

ip --brief a s
brctl show
nmcli con show
sudo nmtui
NetworkManager TUI

Verify with

ip --brief a s
brctl show
nmcli con show
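
nmtui is interactive, so the bridge setup is hard to repeat on a rebuild. The same kind of bridge can be described with nmcli. The sketch below only prints the commands for review rather than running them. The helper name print_bridge_cmds, the port connection name and the NIC name enp1s0 are assumptions, so substitute the interface shown by ip --brief a s:

```shell
#!/bin/sh
# Sketch: print (not run) nmcli commands that build a bridge over one NIC.
# print_bridge_cmds, the "-port" connection name and the example NIC name
# are assumptions for illustration only.
print_bridge_cmds() {
    bridge=$1
    nic=$2
    cat <<EOF
nmcli connection add type bridge ifname $bridge con-name $bridge
nmcli connection add type ethernet slave-type bridge con-name $bridge-port ifname $nic master $bridge
nmcli connection up $bridge
EOF
}

print_bridge_cmds br0 enp1s0
```

Any existing wired connection profile on the NIC would probably need to be brought down or removed first, and doing this over ssh risks cutting off the session, so this is best run from the console.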

Configure Private Virtual Switch:

Use virsh to create the private network:

istacey@ubuntu-z200-01:~$ vi /tmp/br0.xml
istacey@ubuntu-z200-01:~$ cat /tmp/br0.xml
<network> 
  <name>br0</name> 
  <forward mode="bridge"/> 
  <bridge name="br0" /> 
</network>
istacey@ubuntu-z200-01:~$ sudo virsh net-list --all
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes

istacey@ubuntu-z200-01:~$ sudo virsh net-define /tmp/br0.xml 
Network br0 defined from /tmp/br0.xml

istacey@ubuntu-z200-01:~$ sudo virsh net-start br0
Network br0 started

istacey@ubuntu-z200-01:~$ sudo virsh net-autostart br0
Network br0 marked as autostarted

istacey@ubuntu-z200-01:~$ sudo virsh net-list --all
 Name      State    Autostart   Persistent
--------------------------------------------
 br0       active   yes         yes
 default   active   yes         yes

istacey@ubuntu-z200-01:~$

Enable incoming ssh:

sudo apt update 
sudo apt install openssh-server
sudo systemctl status ssh

Test KVM

To test KVM, I created a temporary VM via the Virtual Machine Manager GUI (virt-manager), connected to the br0 bridge and used ssh to connect.

Install Vagrant:

KVM is all that is required to create VMs, either manually through the virt-manager GUI or scripted via virt-install, Ansible or another automation tool. For this exercise, though, I thought I’d try Vagrant: I plan to build and rebuild this lab frequently, and Vagrant is a popular tool for quickly spinning up VMs. It is not something I’d previously played with, so I thought I’d check it out.

Download and install

Installed as per https://www.vagrantup.com/downloads.

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -

sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"

sudo apt-get update && sudo apt-get install vagrant

vagrant --version

Enable Libvirt provider plugin

We need to install the libvirt provider plugin, as Vagrant is only aware of Hyper-V, Docker and Oracle VirtualBox by default, as shown below.

Default Vagrant Providers

However, I hit the following bug when trying to install:

istacey@ubuntu-z200-01:~/vagrant$ vagrant plugin install vagrant-libvirt
Installing the 'vagrant-libvirt' plugin. This can take a few minutes...
Building native extensions. This could take a while...
Vagrant failed to properly resolve required dependencies. These
errors can commonly be caused by misconfigured plugin installations
or transient network issues. The reported error is:

ERROR: Failed to build gem native extension.

....

common.c:27:10: fatal error: st.h: No such file or directory
   27 | #include <st.h>
      |          ^~~~~~
compilation terminated.
make: *** [Makefile:245: common.o] Error 1

make failed, exit code 2

Gem files will remain installed in /home/istacey/.vagrant.d/gems/3.0.1/gems/ruby-libvirt-0.7.1 for inspection.
Results logged to /home/istacey/.vagrant.d/gems/3.0.1/extensions/x86_64-linux/3.0.0/ruby-libvirt-0.7.1/gem_make.out

The bug is described here: https://github.com/hashicorp/vagrant/issues/12445#issuecomment-876254254

After applying the suggested hotfix, I was able to install the plugin and test successfully:

vagrant-libvirt plugin
First Vagrant VM
Vagrant VM and manually provisioned VM running

Create the Worker Node VMs

With KVM working and Vagrant configured, we can create the VMs that will become worker nodes in the K8s cluster. Below is my Vagrantfile to spin up two VMs. I referred to https://github.com/vagrant-libvirt/vagrant-libvirt for options:

Vagrant.configure('2') do |config|
  config.vm.box = "generic/ubuntu2004"

  config.vm.define :k8swrk01 do |k8swrk01|
    k8swrk01.vm.hostname = "k8s-worker01"
    k8swrk01.vm.network :private_network, type: "dhcp",
      libvirt__network_name: "br0"
    # vagrant-libvirt registers the :libvirt provider; memory is in MB
    k8swrk01.vm.provider :libvirt do |libvirt|
      libvirt.memory = 2048
      libvirt.cpus   = 2
    end
  end

  config.vm.define :k8swrk02 do |k8swrk02|
    k8swrk02.vm.hostname = "k8s-worker02"
    k8swrk02.vm.network :private_network, type: "dhcp",
      libvirt__network_name: "br0"
    # same sizing as the first worker
    k8swrk02.vm.provider :libvirt do |libvirt|
      libvirt.memory = 2048
      libvirt.cpus   = 2
    end
  end

end
Running vagrant up to start the two VMs
VMs running

Install Ubuntu on the Raspberry Pi

Following https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#2-prepare-the-sd-card

Configure nodes

Next configure the nodes, creating user accounts, copying ssh-keys, configuring sudoers, etc.

See part 2 for bootstrapping a new Kubernetes cluster.

Resources

Here are some articles I came across in my research, or by complete accident:

Tool Homepages

Installation

  • https://leftasexercise.com/2020/05/15/managing-kvm-virtual-machines-part-i-vagrant-and-libvirt/
  • https://www.taniarascia.com/what-are-vagrant-and-virtualbox-and-how-do-i-use-them/
  • https://www.hebergementwebs.com/news/how-to-configure-a-kubernetes-cluster-on-ubuntu-20-04-18-04-16-04-in-14-steps
  • https://ostechnix.com/how-to-use-vagrant-with-libvirt-kvm-provider/

Categories
Linux

Install Docker CE on RHEL8

As per the Red Hat documentation, Docker is not supported in RHEL 8.

The Podman, Skopeo, and Buildah tools were developed to replace Docker command features. Each tool in this scenario is more lightweight and focused on a subset of features.

For my latest work project, however, where we will be deploying Kubernetes clusters with Rancher, we need RHEL8 and Docker.

Rancher Support Matrix

Manual Install

Following https://linuxconfig.org/how-to-install-docker-in-rhel-8

Add and enable the docker-ce repo with dnf config-manager. Verify with repolist:

$ sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo

[ec2-user@ip-192-169-2-20 ~]$ sudo dnf repolist -v  | grep docker-ce-stable -A10
repo: using cache for: docker-ce-stable
docker-ce-stable: using metadata from Wed 02 Jun 2021 07:27:37 PM UTC.
....

Repo-id            : docker-ce-stable
Repo-name          : Docker CE Stable - x86_64
Repo-revision      : 1622662057
Repo-updated       : Wed 02 Jun 2021 07:27:37 PM UTC
Repo-pkgs          : 38
Repo-available-pkgs: 38
Repo-size          : 937 M
Repo-baseurl       : https://download.docker.com/linux/centos/8/x86_64/stable
Repo-expire        : 172,800 second(s) (last: Tue 15 Jun 2021 03:49:08 PM UTC)
Repo-filename      : /etc/yum.repos.d/docker-ce.repo

Display available versions and install with dnf and the --nobest flag:

[ec2-user@ip-192-169-2-20 ~]$ sudo dnf list docker-ce --showduplicates | sort -r
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 0:40:11 ago on Tue 15 Jun 2021 03:49:08 PM UTC.
docker-ce.x86_64                3:20.10.7-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.6-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.5-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.4-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.3-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.2-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.1-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.0-3.el8                 docker-ce-stable
docker-ce.x86_64                3:19.03.15-3.el8                docker-ce-stable
docker-ce.x86_64                3:19.03.14-3.el8                docker-ce-stable
docker-ce.x86_64                3:19.03.13-3.el8                docker-ce-stable
Available Packages
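
One caveat with the listing above: sort -r compares text, not version numbers, so a later release such as 20.10.10 would sort below 20.10.7. A minimal sketch that picks the newest version with a version-aware sort follows. The helper name latest_docker_ce is hypothetical, and sort -V assumes GNU coreutils:

```shell
#!/bin/sh
# Sketch: pick the newest docker-ce version from "dnf list --showduplicates".
# latest_docker_ce is a hypothetical helper that reads the listing on stdin.
latest_docker_ce() {
    # keep only the package rows, print the version column, sort by version
    awk '$1 ~ /^docker-ce\./ { print $2 }' | sort -V | tail -n 1
}

# Example:
# sudo dnf list docker-ce --showduplicates | latest_docker_ce
```

The result includes the epoch prefix (3: in the listing above), exactly as dnf prints it in the version column.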

[ec2-user@ip-192-169-2-20 ~]$ sudo dnf install --nobest docker-ce
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 0:40:17 ago on Tue 15 Jun 2021 03:49:08 PM UTC.
Dependencies resolved.
===============================================================================================================================================
 Package                            Architecture Version                                           Repository                             Size
===============================================================================================================================================
Installing:
 docker-ce                          x86_64       3:20.10.7-3.el8                                   docker-ce-stable                       27 M
Installing dependencies:
 container-selinux                  noarch       2:2.162.0-1.module+el8.4.0+11311+9da8acfb         rhui-rhel-8-appstream-rhui-rpms        52 k
 containerd.io                      x86_64       1.4.6-3.1.el8                                     docker-ce-stable                       34 M
 docker-ce-cli                      x86_64       1:20.10.7-3.el8                                   docker-ce-stable                       33 M
 docker-ce-rootless-extras          x86_64       20.10.7-3.el8                                     docker-ce-stable                      9.2 M
 docker-scan-plugin                 x86_64       0.8.0-3.el8                                       docker-ce-stable                      4.2 M
 fuse-common                        x86_64       3.2.1-12.el8                                      rhui-rhel-8-baseos-rhui-rpms           21 k
 fuse-overlayfs                     x86_64       1.4.0-3.module+el8.4.0+11311+9da8acfb             rhui-rhel-8-appstream-rhui-rpms        72 k
 fuse3                              x86_64       3.2.1-12.el8                                      rhui-rhel-8-baseos-rhui-rpms           50 k
 fuse3-libs                         x86_64       3.2.1-12.el8                                      rhui-rhel-8-baseos-rhui-rpms           94 k
 iptables                           x86_64       1.8.4-10.el8                                      rhui-rhel-8-baseos-rhui-rpms          581 k
 libcgroup                          x86_64       0.41-19.el8                                       rhui-rhel-8-baseos-rhui-rpms           70 k
 libnetfilter_conntrack             x86_64       1.0.6-5.el8                                       rhui-rhel-8-baseos-rhui-rpms           65 k
 libnfnetlink                       x86_64       1.0.1-13.el8                                      rhui-rhel-8-baseos-rhui-rpms           33 k
 libnftnl                           x86_64       1.1.5-4.el8                                       rhui-rhel-8-baseos-rhui-rpms           83 k
 libslirp                           x86_64       4.3.1-1.module+el8.4.0+11311+9da8acfb             rhui-rhel-8-appstream-rhui-rpms        69 k
 policycoreutils-python-utils       noarch       2.9-9.el8                                         rhui-rhel-8-baseos-rhui-rpms          251 k
 slirp4netns                        x86_64       1.1.8-1.module+el8.4.0+11311+9da8acfb             rhui-rhel-8-appstream-rhui-rpms        51 k
Enabling module streams:
 container-tools                                 rhel8

Transaction Summary
===============================================================================================================================================
Install  18 Packages

Total download size: 108 M
Installed size: 441 M
Is this ok [y/N]: y
Downloading Packages:

firewalld is already disabled on this host, so no extra step is needed to address the known concerns about DNS resolution inside Docker containers.

Add my user to the docker group and start/enable the docker daemon.

$ sudo usermod -aG docker ec2-user
$ sudo systemctl enable --now docker
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.

$ systemctl is-active docker
active
$ systemctl is-enabled docker
enabled
[ec2-user@ip-192-169-2-20 ~]$ cat /etc/redhat-release && docker --version
Red Hat Enterprise Linux release 8.4 (Ootpa)
Docker version 20.10.7, build f0df350

Test docker with hello-world.

Automated Install with Ansible

As I have a number of servers to repeat the installation on, I’ll use an Ansible playbook.

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ ansible --version
ansible 2.9.10
  config file = None
  configured module search path = ['/home/ec2-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.6.8 (default, Mar 18 2021, 08:58:41) [GCC 8.4.1 20200928 (Red Hat 8.4.1-1)]

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ cat /etc/hosts | grep Rancher  | awk '{print $2}' > inv
[ec2-user@ip-192-169-2-108 ansible-rhel8]$ vi inv
[ec2-user@ip-192-169-2-108 ansible-rhel8]$ cat inv
[rancher]
DevRHEL8-Rancher-01
DevRHEL8-Rancher-02
DevRHEL8-Rancher-03
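
The inventory above was built in two steps: extract the host names, then add the [rancher] header in vi. A minimal sketch that writes both in one go follows. The helper name make_inventory is hypothetical, and unlike the grep above it matches the pattern against the host name column only:

```shell
#!/bin/sh
# Sketch: build the inventory in one step, including the group header.
# make_inventory is a hypothetical helper; it prints to stdout.
make_inventory() {
    group=$1
    pattern=$2
    hosts_file=${3:-/etc/hosts}
    printf '[%s]\n' "$group"
    # column 2 of /etc/hosts is the first host name for each address;
    # commented-out lines are skipped
    awk -v pat="$pattern" '$0 !~ /^[ \t]*#/ && $2 ~ pat { print $2 }' "$hosts_file"
}

# Example:
# make_inventory rancher Rancher > inv
```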

In the playbook I’m also taking care of some Rancher prerequisites and other tasks:

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ cat docker-rancher/tasks/main.yaml
---

- name: Upgrade all packages
  dnf:
    name: "*"
    state: latest
  tags: [update_packages]

- name: Install packages
  dnf:
    name:
      - psacct
      - git
      - yum-utils
      - device-mapper-persistent-data
      - lvm2
      - vim
    state: present
  tags: [dnf_installs]

- name: Enable docker-ce repo
  shell: dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
  tags: [docker_repo]

- name: Install docker
  dnf:
    name: docker-ce
    state: present
  tags: [docker_install]

- name: enable docker service
  systemd:
    name: docker
    state: restarted
    enabled: yes
    daemon_reload: yes
  tags: [docker_restart]

- name: Update sshd_config AllowAgentForwarding
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#AllowAgentForwarding yes'
    line: 'AllowAgentForwarding yes'
  tags: [rancher-prereq]

- name: Update sshd_config AllowTcpForwarding
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#AllowTcpForwarding yes'
    line: 'AllowTcpForwarding yes'
  tags: [rancher-prereq]

- name: Update sshd_config GatewayPorts
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#GatewayPorts no'
    line: 'GatewayPorts yes'
  tags: [rancher-prereq]

- name: check bridge networking is allowed
  shell: modprobe br_netfilter
  tags: [bridge]

- name: check bridge networking is allowed bridge-nf-call-iptables
  shell: echo "1" > /proc/sys/net/bridge/bridge-nf-call-iptables
  tags: [bridge]

- name: Add Kubernetes repo
  yum_repository:
    name: kubernetes
    description: Kubernetes repo
    file: kubernetes
    baseurl: https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled: yes
    gpgcheck: 1
    repo_gpgcheck: 1
    gpgkey: https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
  tags: [k8srepo]

- name: Install packages
  dnf:
    name:
      - kubectl
    state: present
  tags: [kubectl_install]

- name: Update /etc/hosts
  copy:
    src: /etc/hosts
    dest: /etc/hosts
    mode: '0644'
  tags: [hosts_file]
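
The two bridge tasks in the playbook take effect immediately but do not survive a reboot, because modprobe and a write to /proc/sys only change the running kernel. A minimal sketch of making them persistent through the systemd drop-in directories follows. The helper name persist_bridge_settings and the k8s.conf file names are assumptions, and the optional argument is a root prefix for trying it out in a scratch directory:

```shell
#!/bin/sh
# Sketch: persist the br_netfilter module and bridge sysctl across reboots.
# persist_bridge_settings and the k8s.conf file names are assumptions;
# the optional root prefix allows a dry run into a scratch directory.
persist_bridge_settings() {
    root=${1:-}
    mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"
    # modules listed here are loaded at boot by systemd-modules-load
    printf '%s\n' 'br_netfilter' > "$root/etc/modules-load.d/k8s.conf"
    # settings here are applied at boot by systemd-sysctl
    printf '%s\n' 'net.bridge.bridge-nf-call-iptables = 1' \
        > "$root/etc/sysctl.d/k8s.conf"
}
```

Run as root with no argument, it writes under /etc, and sysctl --system then applies the setting without a reboot. The same two files could equally be managed from the playbook itself.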

Running the playbook:

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ ansible-playbook -i ./inv docker-rancher.yaml -b

PLAY [rancher] ********************************************************************************************************************************

TASK [Gathering Facts] ************************************************************************************************************************
ok: [DevRHEL8-Rancher-03]
ok: [DevRHEL8-Rancher-02]
ok: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Upgrade all packages] **************************************************************************************************
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Install packages] ******************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Enable docker-ce repo] *************************************************************************************************
[WARNING]: Consider using the dnf module rather than running 'dnf'.  If you need to use command because dnf is insufficient you can add 'warn:
false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message.
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Install docker] ********************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-02]

TASK [docker-rancher : enable docker service] *************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Update sshd_config AllowAgentForwarding] *******************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-02]

TASK [docker-rancher : Update sshd_config AllowTcpForwarding] *********************************************************************************
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]

TASK [docker-rancher : Update sshd_config GatewayPorts] ***************************************************************************************
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : check bridge networking is allowed] ************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : check bridge networking is allowed bridge-nf-call-iptables] ************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Add Kubernetes repo] ***************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Install packages] ******************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Update /etc/hosts] *****************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-02]

PLAY RECAP ************************************************************************************************************************************
DevRHEL8-Rancher-01        : ok=14   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
DevRHEL8-Rancher-02        : ok=14   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
DevRHEL8-Rancher-03        : ok=14   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Test as before with ‘docker run hello-world’ and verify docker version:

[ec2-user@ip-192-169-2-7 ~]$ cat /etc/redhat-release && docker --version
Red Hat Enterprise Linux release 8.4 (Ootpa)
Docker version 20.10.7, build f0df350

Scripted Install

Rancher provides a handy install script at https://releases.rancher.com/install-docker/20.10.sh

[ec2-user@ip-192-169-2-250 ~]$ curl  https://releases.rancher.com/install-docker/20.10.sh | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17683  100 17683    0     0  46904      0 --:--:-- --:--:-- --:--:-- 46904
# Executing docker install script, commit: 7cae5f8b0decc17d6571f9f52eb840fbc13b2737
+ sudo -E sh -c 'yum install -y -q yum-utils'
+ sudo -E sh -c 'yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
+ '[' stable '!=' stable ']'
+ '[' rhel = rhel ']'
+ adjust_repo_releasever 8.2
+ DOWNLOAD_URL=https://download.docker.com
+ case $1 in
+ releasever=8
+ for channel in "stable" "test" "nightly"
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-stable.baseurl=https://download.docker.com/linux/centos/8/\$basearch/stable --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-stable-debuginfo.baseurl=https://download.docker.com/linux/centos/8/debug-\$basearch/stable --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-stable-source.baseurl=https://download.docker.com/linux/centos/8/source/stable --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ for channel in "stable" "test" "nightly"
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-test.baseurl=https://download.docker.com/linux/centos/8/\$basearch/test --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-test-debuginfo.baseurl=https://download.docker.com/linux/centos/8/debug-\$basearch/test --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-test-source.baseurl=https://download.docker.com/linux/centos/8/source/test --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ for channel in "stable" "test" "nightly"
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-nightly.baseurl=https://download.docker.com/linux/centos/8/\$basearch/nightly --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-nightly-debuginfo.baseurl=https://download.docker.com/linux/centos/8/debug-\$basearch/nightly --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-nightly-source.baseurl=https://download.docker.com/linux/centos/8/source/nightly --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ [[ 8.2 =~ 7\. ]]
+ '[' 8.2 == 7 ']'
+ sudo -E sh -c 'yum makecache'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Docker CE Stable - x86_64                                                                                      137 kB/s |  14 kB     00:00
Red Hat Update Infrastructure 3 Client Configuration Server 8                                                   35 kB/s | 2.1 kB     00:00
Red Hat Enterprise Linux 8 for x86_64 - AppStream from RHUI (RPMs)                                              22 kB/s | 2.8 kB     00:00
Red Hat Enterprise Linux 8 for x86_64 - BaseOS from RHUI (RPMs)                                                 24 kB/s | 2.4 kB     00:00
Metadata cache created.
INFO: Searching repository for VERSION '20.10.7'
INFO: yum list --showduplicates 'docker-ce' | grep '20.10.7.*el' | tail -1 | awk '{print $2}'
+ '[' -n 20.10.7-3.el8 ']'
+ sudo -E sh -c 'yum install -y -q docker-ce-cli-20.10.7-3.el8'
warning: /var/cache/dnf/docker-ce-stable-fa9dc42ab4cec2f4/packages/docker-ce-cli-20.10.7-3.el8.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 621e9f35: NOKEY
Importing GPG key 0x621E9F35:
 Userid     : "Docker Release (CE rpm) <docker@docker.com>"
 Fingerprint: 060A 61C5 1B55 8A7F 742B 77AA C52F EB6B 621E 9F35
 From       : https://download.docker.com/linux/centos/gpg

Installed:
  docker-ce-cli-1:20.10.7-3.el8.x86_64                                  docker-scan-plugin-0.8.0-3.el8.x86_64

+ sudo -E sh -c 'yum install -y -q docker-ce-20.10.7-3.el8'


Installed:
  container-selinux-2:2.162.0-1.module+el8.4.0+11311+9da8acfb.noarch        containerd.io-1.4.6-3.1.el8.x86_64
  docker-ce-3:20.10.7-3.el8.x86_64                                          docker-ce-rootless-extras-20.10.7-3.el8.x86_64
  fuse-common-3.2.1-12.el8.x86_64                                           fuse-overlayfs-1.4.0-3.module+el8.4.0+11311+9da8acfb.x86_64
  fuse3-3.2.1-12.el8.x86_64                                                 fuse3-libs-3.2.1-12.el8.x86_64
  iptables-1.8.4-10.el8.x86_64                                              libcgroup-0.41-19.el8.x86_64
  libnetfilter_conntrack-1.0.6-5.el8.x86_64                                 libnfnetlink-1.0.1-13.el8.x86_64
  libnftnl-1.1.5-4.el8.x86_64                                               libslirp-4.3.1-1.module+el8.4.0+11311+9da8acfb.x86_64
  policycoreutils-python-utils-2.9-9.el8.noarch                             slirp4netns-1.1.8-1.module+el8.4.0+11311+9da8acfb.x86_64

+ '[' -n 1 ']'
+ sudo -E sh -c 'yum install -y -q docker-ce-rootless-extras-20.10.7-3.el8'
+ command_exists iptables
+ command -v iptables
+ start_docker
+ '[' '!' -z ']'
+ '[' -d /run/systemd/system ']'
+ sudo -E sh -c 'systemctl start docker'
+ sudo -E sh -c 'docker version'
Client: Docker Engine - Community
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        f0df350
 Built:             Wed Jun  2 11:56:24 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.7
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       b0f5bc3
  Built:            Wed Jun  2 11:54:48 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.6
  GitCommit:        d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version:          1.0.0-rc95
  GitCommit:        b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

================================================================================

To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:

    dockerd-rootless-setuptool.sh install

Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.


To run the Docker daemon as a fully privileged service, but granting non-root
users access, refer to https://docs.docker.com/go/daemon-access/

WARNING: Access to the remote API on a privileged Docker daemon is equivalent
         to root access on the host. Refer to the 'Docker daemon attack surface'
         documentation for details: https://docs.docker.com/go/attack-surface/

================================================================================

[ec2-user@ip-192-169-2-250 ~]$
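
Piping curl straight into sh runs whatever the URL serves at that moment. When the script is going to be reused across many servers, one option is to download it once, review it, record its checksum, and only run copies that still match. A minimal sketch of the comparison step follows. The helper name verify_sha256 is hypothetical, and the expected hash is whatever was recorded after the review:

```shell
#!/bin/sh
# Sketch: succeed only if a file's SHA-256 matches a previously recorded value.
# verify_sha256 is a hypothetical helper; sha256sum comes from coreutils.
verify_sha256() {
    file=$1
    expected=$2
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    [ "$actual" = "$expected" ]
}

# Example (REVIEWED_HASH is a placeholder for the hash noted after review):
# curl -fsSL https://releases.rancher.com/install-docker/20.10.sh -o install-docker.sh
# verify_sha256 install-docker.sh "$REVIEWED_HASH" && sh install-docker.sh
```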



Verify as before:

[ec2-user@ip-192-169-2-250 ~]$ cat /etc/redhat-release && docker --version
Red Hat Enterprise Linux release 8.4 (Ootpa)
Docker version 20.10.7, build f0df350

I’ll use this script in later versions of the Ansible playbook.

Wrapping Up

I didn’t expect installing Docker on RHEL8 to be so easy. I expected to hit dependency issues, but it seems that many of the install issues are fixed in the later Docker 20.10 releases: https://medium.com/nttlabs/docker-20-10-59cc4bd59d37

It is still not an ideal situation: Red Hat are unlikely to help with container-related issues on a support case when we are running Docker rather than Podman on RHEL8. Docker itself, however, appears to be stable.