Categories
Cloud Linux

Deploy Web App in Azure Kubernetes Service (AKS)

Aim:

Publish website in new AKS (Azure Kubernetes Service) cluster, initially following the tutorial at https://docs.microsoft.com/en-us/azure/aks/tutorial-kubernetes-prepare-acr?tabs=azure-cli

Create the Azure Resources:

This tutorial requires that you’re running the Azure CLI version 2.0.53 or later. Run az --version to find the version.
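The version requirement can also be checked in a script. The following is a minimal sketch, not part of the original tutorial: the helper name version_ge is made up for illustration, and it relies on the version sort (sort -V) from GNU coreutils.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; depends on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    # The smaller of the two versions sorts first; A >= B when that one is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage. Assumption: the installed CLI supports `az version`, which
# prints JSON containing an "azure-cli" key; very old releases only have
# `az --version`.
#   installed=$(az version --query '"azure-cli"' --output tsv)
#   version_ge "$installed" 2.0.53 || echo "Azure CLI $installed is too old" >&2
```

A plain string comparison would rank 2.0.9 above 2.0.53, which is why the sketch uses a version-aware sort.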

Use ‘az login’ to log in to the Azure account:

az login

The CLI opens your default browser and loads an Azure sign-in page. I have two-factor authentication enabled, so I also need my phone to confirm access.

Create a new Resource Group:

az group create --name AKS4BIM02 --location northeurope

Create an Azure Container Registry:

I will not use this initially, but I will create it now for future use.

az acr create --resource-group AKS4BIM02 --name acr4BIM --sku Basic

List images in the registry (none yet):

az acr repository list --name acr4BIM --output table

Create the AKS cluster:

See options described at the following links:

istacey@DUB004043:~$ az aks create \
>     --resource-group AKS4BIM02 \
>     --name AKS4BIM02 \
>     --node-count 2 \
>     --generate-ssh-keys \
>     --zones 1 2

Connect to the cluster:

If necessary install the Kubernetes CLI (az aks install-cli) and then connect:

$ az aks get-credentials --resource-group AKS4BIM02 --name AKS4BIM02
Merged "AKS4BIM02" as current context in /home/istacey/.kube/config

$ kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
aks-nodepool1-88432225-vmss000000   Ready    agent   7m22s   v1.20.9   10.240.0.4    <none>        Ubuntu 18.04.6 LTS   5.4.0-1059-azure   containerd://1.4.9+azure
aks-nodepool1-88432225-vmss000001   Ready    agent   7m17s   v1.20.9   10.240.0.5    <none>        Ubuntu 18.04.6 LTS   5.4.0-1059-azure   containerd://1.4.9+azure

Create Storage Account and Azure File share

Following https://docs.microsoft.com/en-us/azure/aks/azure-files-volume

#Set variables:
AKS_PERS_STORAGE_ACCOUNT_NAME=istacestorageacct01
AKS_PERS_RESOURCE_GROUP=AKS4BIM02
AKS_PERS_LOCATION=northeurope
AKS_PERS_SHARE_NAME=aksshare4bim02

az storage account create -n $AKS_PERS_STORAGE_ACCOUNT_NAME -g $AKS_PERS_RESOURCE_GROUP -l $AKS_PERS_LOCATION --sku Standard_LRS

export AZURE_STORAGE_CONNECTION_STRING=$(az storage account show-connection-string -n $AKS_PERS_STORAGE_ACCOUNT_NAME -g $AKS_PERS_RESOURCE_GROUP -o tsv)

echo $AZURE_STORAGE_CONNECTION_STRING

az storage share create -n $AKS_PERS_SHARE_NAME --connection-string $AZURE_STORAGE_CONNECTION_STRING

STORAGE_KEY=$(az storage account keys list --resource-group $AKS_PERS_RESOURCE_GROUP --account-name $AKS_PERS_STORAGE_ACCOUNT_NAME --query "[0].value" -o tsv)

echo Storage account name: $AKS_PERS_STORAGE_ACCOUNT_NAME

echo $STORAGE_KEY

Create a Kubernetes secret:

kubectl create secret generic azure-secret --from-literal=azurestorageaccountname=$AKS_PERS_STORAGE_ACCOUNT_NAME --from-literal=azurestorageaccountkey=$STORAGE_KEY

Upload index.html and images folder to file share via Azure Portal

index.html is based on https://www.w3schools.com/howto/howto_css_coming_soon.asp and relates to one of my favorite subjects, Luton Town FC! 🙂

Create the Kubernetes Deployment

Create new namespace:

kubectl create ns isnginx

Create new deployment yaml manifest file:

$ vi nginx-deployment.yaml

$ cat nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: isnginx-deployment
  namespace: isnginx
  labels:
    app: isnginx-deployment
spec:
  replicas: 2
  selector:
      matchLabels:
        app: isnginx-deployment
  template:
    metadata:
      labels:
        app: isnginx-deployment
    spec:
      containers:
      - name: isnginx-deployment
        image: nginx
        volumeMounts:
        - name: nginxtest01
          mountPath: /usr/share/nginx/html
      volumes:
      - name: nginxtest01
        azureFile:
          secretName: azure-secret
          shareName: aksshare4bim02
          readOnly: false

Create the deployment:

$ kubectl create -f nginx-deployment.yaml
deployment.apps/isnginx-deployment created

$ kubectl rollout status deployment isnginx-deployment -n isnginx
Waiting for deployment "isnginx-deployment" rollout to finish: 0 of 2 updated replicas are available...

$ kubectl -n isnginx describe pod isnginx-deployment-5ff78ff678-7dphq  | tail -5
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    2m32s                default-scheduler  Successfully assigned isnginx/isnginx-deployment-5ff78ff678-7dphq to aks-nodepool1-88432225-vmss000000
  Warning  FailedMount  29s                  kubelet            Unable to attach or mount volumes: unmounted volumes=[nginxtest01], unattached volumes=[default-token-lh6xf nginxtest01]: timed out waiting for the condition
  Warning  FailedMount  24s (x9 over 2m32s)  kubelet            MountVolume.SetUp failed for volume "aksshare4bim02" : Couldn't get secret isnginx/azure-secret

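The mount fails because the secret was created in the default namespace, while the pods run in the isnginx namespace. Secrets are namespaced, so the kubelet looks for isnginx/azure-secret and cannot find it.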
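The diagnosis can be confirmed by checking each namespace for the secret. This is a minimal sketch; the helper name secret_in_namespace is made up for illustration and simply wraps kubectl get secret.

```shell
# secret_in_namespace NAMESPACE NAME: succeeds if the named secret exists
# in the given namespace. Hypothetical helper around `kubectl get secret`.
secret_in_namespace() {
    kubectl -n "$1" get secret "$2" >/dev/null 2>&1
}

# Example usage against the names used in this post:
#   secret_in_namespace default azure-secret && echo "present in default"
#   secret_in_namespace isnginx azure-secret || echo "missing in isnginx"
```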

Create a secret in the isnginx namespace:

kubectl -n isnginx create secret generic azure-secret --from-literal=azurestorageaccountname=$AKS_PERS_STORAGE_ACCOUNT_NAME --from-literal=azurestorageaccountkey=$STORAGE_KEY
$ kubectl -n isnginx get secret
NAME                  TYPE                                  DATA   AGE
azure-secret          Opaque                                2      97m
default-token-lh6xf   kubernetes.io/service-account-token   3      104m

Remove the deployment and recreate:

$ kubectl delete -f nginx-deployment.yaml
deployment.apps "isnginx-deployment" deleted

$ kubectl create -f nginx-deployment.yaml
deployment.apps/isnginx-deployment created

$ kubectl rollout status deployment isnginx-deployment -n isnginx
Waiting for deployment "isnginx-deployment" rollout to finish: 0 of 2 updated replicas are available...
Waiting for deployment "isnginx-deployment" rollout to finish: 1 of 2 updated replicas are available...
deployment "isnginx-deployment" successfully rolled out
$ kubectl -n isnginx get all
NAME                                      READY   STATUS    RESTARTS   AGE
pod/isnginx-deployment-6b8d9db99c-2kj5l   1/1     Running   0          80s
pod/isnginx-deployment-6b8d9db99c-w65gp   1/1     Running   0          80s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/isnginx-deployment   2/2     2            2           80s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/isnginx-deployment-6b8d9db99c   2         2         2       80s

Create service / Expose deployment

And get external IP:

$ kubectl -n isnginx expose deployment isnginx-deployment --port=80 --type=LoadBalancer
service/isnginx-deployment exposed

$ kubectl -n isnginx get svc
NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
isnginx-deployment   LoadBalancer   10.0.110.58   20.93.54.110   80:30451/TCP   95m
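Right after the service is created, the EXTERNAL-IP column can show <pending> while Azure provisions the load balancer. A minimal sketch of a polling helper follows; the name wait_for_lb_ip and its default retry values are assumptions for illustration, not part of the original steps.

```shell
# wait_for_lb_ip NAMESPACE SERVICE [TRIES] [DELAY_SECONDS]
# Polls the service until the load balancer reports an IP, then prints it.
# Hypothetical helper; returns non-zero if no IP appears within TRIES attempts.
wait_for_lb_ip() {
    local ns=$1 svc=$2 tries=${3:-30} delay=${4:-10} i ip
    for i in $(seq "$tries"); do
        ip=$(kubectl -n "$ns" get svc "$svc" \
            -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
        if [ -n "$ip" ]; then
            printf '%s\n' "$ip"
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Example usage:
#   IP=$(wait_for_lb_ip isnginx isnginx-deployment) && curl "http://$IP"
```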

Test with curl and web browser:

$ curl http://20.93.54.110 | grep -i luton
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2139  100  2139    0     0  44562      0 --:--:-- --:--:-- --:--:-- 45510
    <p>Luton Town's New Stadium</p>

Create HTTPS ingress:

AIM: Create HTTPS ingress with signed certificate and redirect http requests to HTTPS. Following https://docs.microsoft.com/en-us/azure/aks/ingress-tls

Create Ingress controller:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.4/deploy/static/provider/cloud/deploy.yaml
$ kubectl -n ingress-nginx get all
NAME                                            READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-twnd7        0/1     Completed   0          8m3s
pod/ingress-nginx-admission-patch-vnsj4         0/1     Completed   1          8m3s
pod/ingress-nginx-controller-5d4b6f79c4-mknxc   1/1     Running     0          8m4s

NAME                                         TYPE           CLUSTER-IP   EXTERNAL-IP     PORT(S)                      AGE
service/ingress-nginx-controller             LoadBalancer   10.0.77.11   20.105.96.112   80:31078/TCP,443:31889/TCP   8m4s
service/ingress-nginx-controller-admission   ClusterIP      10.0.2.37    <none>          443/TCP                      8m4s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           8m4s

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-5d4b6f79c4   1         1         1       8m4s

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           1s         8m3s
job.batch/ingress-nginx-admission-patch    1/1           3s         8m3s

Add an A record to DNS zone

$ IP=20.105.96.112
$ DNSNAME="istac-aks-ingress"
$ PUBLICIPID=$(az network public-ip list --query "[?ipAddress!=null]|[?contains(ipAddress, '$IP')].[id]" --output tsv)
$ az network public-ip update --ids $PUBLICIPID --dns-name $DNSNAME

$ az network public-ip show --ids $PUBLICIPID --query "[dnsSettings.fqdn]" --output tsv
istac-aks-ingress.northeurope.cloudapp.azure.com

$ az network public-ip show --ids $PUBLICIPID --query "[dnsSettings.fqdn]" --output tsv | nslookup
Server:         89.101.160.4
Address:        89.101.160.4#53

Non-authoritative answer:
Name:   istac-aks-ingress.northeurope.cloudapp.azure.com
Address: 20.105.96.112

Install cert-manager with helm:

$ kubectl label namespace ingress-nginx cert-manager.io/disable-validation=true
namespace/ingress-nginx labeled

$ helm repo add jetstack https://charts.jetstack.io
"jetstack" has been added to your repositories

$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈
$ helm install cert-manager jetstack/cert-manager \
>   --namespace ingress-nginx \
> --set installCRDs=true \
> --set nodeSelector."kubernetes\.io/os"=linux

NAME: cert-manager
LAST DEPLOYED: Mon Oct 25 10:56:51 2021
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.5.4 has been deployed successfully!
$ kubectl -n ingress-nginx get all
NAME                                            READY   STATUS      RESTARTS   AGE
pod/cert-manager-88ddc7f8d-ltz9z                1/1     Running     0          158m
pod/cert-manager-cainjector-748dc889c5-kcdx7    1/1     Running     0          158m
pod/cert-manager-webhook-55dfcc5474-tbrz2       1/1     Running     0          158m
pod/ingress-nginx-admission-create-twnd7        0/1     Completed   0          3h45m
pod/ingress-nginx-admission-patch-vnsj4         0/1     Completed   1          3h45m
pod/ingress-nginx-controller-5d4b6f79c4-mknxc   1/1     Running     0          3h45m

NAME                                         TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
service/cert-manager                         ClusterIP      10.0.7.50      <none>          9402/TCP                     158m
service/cert-manager-webhook                 ClusterIP      10.0.132.151   <none>          443/TCP                      158m
service/ingress-nginx-controller             LoadBalancer   10.0.77.11     20.105.96.112   80:31078/TCP,443:31889/TCP   3h45m
service/ingress-nginx-controller-admission   ClusterIP      10.0.2.37      <none>          443/TCP                      3h45m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager               1/1     1            1           158m
deployment.apps/cert-manager-cainjector    1/1     1            1           158m
deployment.apps/cert-manager-webhook       1/1     1            1           158m
deployment.apps/ingress-nginx-controller   1/1     1            1           3h45m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-88ddc7f8d                1         1         1       158m
replicaset.apps/cert-manager-cainjector-748dc889c5    1         1         1       158m
replicaset.apps/cert-manager-webhook-55dfcc5474       1         1         1       158m
replicaset.apps/ingress-nginx-controller-5d4b6f79c4   1         1         1       3h45m

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           1s         3h45m
job.batch/ingress-nginx-admission-patch    1/1           3s         3h45m

Create a CA cluster issuer:

$ vi cluster-issuer.yaml
$ cat cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ian@ianstacey.net
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - http01:
        ingress:
          class: nginx
          podTemplate:
            spec:
              nodeSelector:
                "kubernetes.io/os": linux
$ kubectl apply -f cluster-issuer.yaml
clusterissuer.cert-manager.io/letsencrypt created

Create an ingress route:

$ cat http7-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: powercourt-ingress
  namespace: isnginx
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  tls:
  - hosts:
    - istac-aks-ingress.northeurope.cloudapp.azure.com
    secretName: tls-secret
  defaultBackend:
    service:
      name: isnginx-clusterip
      port:
        number: 80


$ kubectl apply -f http7-ingress.yaml
ingress.networking.k8s.io/powercourt-ingress configured
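
The ingress uses a Service named isnginx-clusterip as its default backend, but the creation of that Service is not shown in this post. The following is a minimal sketch of one way to create it, assuming it is a plain ClusterIP service in front of the same deployment on port 80; the helper name create_clusterip_service is made up for illustration.

```shell
# create_clusterip_service: expose the existing deployment inside the cluster
# under the name the ingress expects. Assumption: port 80 and type ClusterIP,
# matching the backend shown in the ingress manifest.
create_clusterip_service() {
    kubectl -n isnginx expose deployment isnginx-deployment \
        --name=isnginx-clusterip --port=80 --target-port=80 --type=ClusterIP
}

# Example usage:
#   create_clusterip_service
#   kubectl -n isnginx get svc isnginx-clusterip
```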

Check resources:

$ kubectl -n isnginx describe ingress powercourt-ingress
Name:             powercourt-ingress
Namespace:        isnginx
Address:          20.105.96.112
Default backend:  isnginx-clusterip:80 (10.244.0.7:80,10.244.1.4:80)
TLS:
  tls-secret terminates istac-aks-ingress.northeurope.cloudapp.azure.com
Rules:
  Host        Path  Backends
  ----        ----  --------
  *           *     isnginx-clusterip:80 (10.244.0.7:80,10.244.1.4:80)
Annotations:  cert-manager.io/cluster-issuer: letsencrypt
              kubernetes.io/ingress.class: nginx
              nginx.ingress.kubernetes.io/rewrite-target: /$2
              nginx.ingress.kubernetes.io/use-regex: true
Events:
  Type    Reason             Age                   From                      Message
  ----    ------             ----                  ----                      -------
  Normal  Sync               13m (x10 over 3h51m)  nginx-ingress-controller  Scheduled for sync
  Normal  UpdateCertificate  13m (x3 over 25m)     cert-manager              Successfully updated Certificate "tls-secret"


$ kubectl get  certificate tls-secret --namespace isnginx
NAME         READY   SECRET       AGE
tls-secret   True    tls-secret   144m

$ kubectl get  certificate tls-secret --namespace isnginx -o yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  creationTimestamp: "2021-10-25T10:17:02Z"
  generation: 4
  name: tls-secret
  namespace: isnginx
  ownerReferences:
  - apiVersion: networking.k8s.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Ingress
    name: powercourt-ingress
    uid: e93a8939-1ed6-444b-b2db-b26aedf01dd8
  resourceVersion: "122626"
  uid: 79631ea4-541b-4251-9c7b-ad5294e6bbc0
spec:
  dnsNames:
  - istac-aks-ingress.northeurope.cloudapp.azure.com
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: letsencrypt
  secretName: tls-secret
  usages:
  - digital signature
  - key encipherment
status:
  conditions:
  - lastTransitionTime: "2021-10-25T12:28:08Z"
    message: Certificate is up to date and has not expired
    observedGeneration: 4
    reason: Ready
    status: "True"
    type: Ready
  notAfter: "2022-01-23T11:28:06Z"
  notBefore: "2021-10-25T11:28:07Z"
  renewalTime: "2021-12-24T11:28:06Z"
  revision: 4

Test:

HTTP requests to http://istac-aks-ingress.northeurope.cloudapp.azure.com are successfully redirected to HTTPS.
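
The redirect can also be checked from the command line. This is a minimal sketch using curl's --write-out variables; the helper name show_redirect is made up for illustration, and no particular status code is assumed.

```shell
# show_redirect URL: print the HTTP status code and the redirect target of the
# first response, without following the redirect. Hypothetical helper.
show_redirect() {
    curl --silent --output /dev/null \
        --write-out '%{http_code} %{redirect_url}\n' "$1"
}

# Example usage:
#   show_redirect http://istac-aks-ingress.northeurope.cloudapp.azure.com
# A 3xx status code followed by an https:// URL indicates the redirect works.
```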

The finished architecture:

Kubernetes Home Lab (part 2)

Bootstrap the cluster with kubeadm:

Following on from part one, we will create our new Kubernetes cluster.

Our Kubernetes Cluster Topology:

A single master/control plane node and two worker nodes:

Deployment Steps

To deploy the cluster, we will first take care of the prerequisites, then install Docker and kubeadm, following the Kubernetes documentation:

Order of deployment steps

Installing kubeadm, kubelet and kubectl

We will install these packages on all the machines:

  • kubeadm: the command to bootstrap the cluster.
  • kubelet: the component that runs on all of the machines in your cluster and does things like starting pods and containers.
  • kubectl: the command line util to talk to your cluster.

For specific versions:

sudo apt-get install -y kubelet=1.21.0-00 kubeadm=1.21.0-00 kubectl=1.21.0-00

Initialize / Create the Cluster:

We can then create our cluster with ‘kubeadm init’

Bootstrap – Attempt 1

My first attempt failed due to a cgroups_memory issue:

root@k8s-master01:~# kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.0.185
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
[preflight] The system verification failed. Printing the output from the verification:
root@k8s-master01:~# docker info | head
Client:
Context:    default
Debug Mode: false
Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
Containers: 2
  Running: 0
root@k8s-master01:~# docker info | grep -i cgroup
Cgroup Driver: systemd
Cgroup Version: 1
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory TCP limit support
WARNING: No oom kill disable support

cgroups Fix

The fix is described here https://phabricator.wikimedia.org/T122734.

As there is no GRUB with Ubuntu on Raspberry Pi (https://unix.stackexchange.com/questions/475973/cant-find-etc-default-grub), I simply had to edit /boot/firmware/cmdline.txt, appending cgroup_enable=memory swapaccount=1 to the kernel command line, and reboot.

root@k8s-master01:~# cat /boot/firmware/cmdline.txt
net.ifnames=0 dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc cgroup_enable=memory swapaccount=1
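
The edit can be scripted so that it is safe to run more than once. The following is a minimal sketch, assuming the file holds the whole kernel command line on its first line, as cmdline.txt does here; the helper name add_cgroup_flags is made up for illustration.

```shell
# add_cgroup_flags FILE: append cgroup_enable=memory and swapaccount=1 to the
# single-line kernel command line in FILE unless they are already present.
# Hypothetical helper; a copy of the original is kept as FILE.bak.
add_cgroup_flags() {
    local file=$1 line flag
    line=$(head -n 1 "$file")
    for flag in cgroup_enable=memory swapaccount=1; do
        case " $line " in
            *" $flag "*) ;;                # flag already present, leave as is
            *) line="$line $flag" ;;       # otherwise append it
        esac
    done
    cp "$file" "$file.bak"
    printf '%s\n' "$line" > "$file"
}

# Example usage (as root, followed by a reboot):
#   add_cgroup_flags /boot/firmware/cmdline.txt
```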

After reboot:

istacey@k8s-master01:~$ uptime

20:54:27 up 1 min,  1 user,  load average: 1.36, 0.52, 0.19
istacey@k8s-master01:~$ cat /proc/cmdline
coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 bcm2708_fb.fbwidth=0 bcm2708_fb.fbheight=0 bcm2708_fb.fbswap=1 smsc95xx.macaddr=DC:A6:32:02:F0:6E vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  net.ifnames=0 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc cgroup_enable=memory swapaccount=1 quiet splash
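
Besides /proc/cmdline, the kernel's own view of its cgroup controllers can be checked. This is a minimal sketch for the cgroup v1 setup shown here, assuming the usual /proc/cgroups layout in which the fourth column is the enabled flag; the helper name memory_cgroup_enabled is made up for illustration.

```shell
# memory_cgroup_enabled [FILE]: succeeds if the "memory" controller is listed
# with enabled=1. FILE defaults to /proc/cgroups; the parameter exists only so
# the check can be tried against a saved copy. Hypothetical helper.
memory_cgroup_enabled() {
    awk '$1 == "memory" { found = ($4 == 1) } END { exit !found }' \
        "${1:-/proc/cgroups}"
}

# Example usage:
#   memory_cgroup_enabled && echo "memory cgroup enabled" \
#       || echo "memory cgroup still disabled"
```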

Bootstrap – Attempt 2

Second attempt is successful:

root@k8s-master01:~# kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.0.185
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: missing optional cgroups: hugetlb
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.185]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.0.185 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.0.185 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 28.511339 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: oq7hb9.vtmiw210ozvi2grh
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.185:6443 --token oq7hb9.vtmiw210ozvi2grh \
        --discovery-token-ca-cert-hash sha256:c87681fc7fec18f015f974e558d8436113019fefbf91123bb5c5190466b5854d
root@k8s-master01:~#

Install the pod network add-on (CNI)

We will use Weave Net for this cluster:

https://kubernetes.io/docs/concepts/cluster-administration/networking/#how-to-implement-the-kubernetes-networking-model

https://www.weave.works/docs/net/latest/kubernetes/
https://www.weave.works/docs/net/latest/kubernetes/kube-addon/

istacey@k8s-master01:~$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created
istacey@k8s-master01:~$

Check Nodes are ready and pods running:

istacey@k8s-master01:~$ kubectl get nodes
NAME           STATUS   ROLES                  AGE   VERSION
k8s-master01   Ready    control-plane,master   13m   v1.22.2

istacey@k8s-master01:~$ kubectl get pods -A
NAMESPACE     NAME                                   READY   STATUS    RESTARTS      AGE
kube-system   coredns-78fcd69978-cnql6               1/1     Running   0             14m
kube-system   coredns-78fcd69978-k4bnk               1/1     Running   0             14m
kube-system   etcd-k8s-master01                      1/1     Running   0             14m
kube-system   kube-apiserver-k8s-master01            1/1     Running   0             14m
kube-system   kube-controller-manager-k8s-master01   1/1     Running   0             14m
kube-system   kube-proxy-lx8bj                       1/1     Running   0             14m
kube-system   kube-scheduler-k8s-master01            1/1     Running   0             14m
kube-system   weave-net-f7f7h                        2/2     Running   1 (99s ago)   2m2s

Join the two Worker Nodes:

To get the join token:

kubeadm token create --help
kubeadm token create --print-join-command

Check once the two nodes are joined

istacey@k8s-master01:~$ kubectl get nodes
NAME           STATUS   ROLES                  AGE    VERSION
k8s-master01   Ready    control-plane,master   18m    v1.22.2
k8s-worker01   Ready    <none>                 2m4s   v1.22.2
k8s-worker02   Ready    <none>                 51s    v1.22.2

istacey@k8s-master01:~$ kubectl get ds -n kube-system
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-proxy   3         3         3       3            3           kubernetes.io/os=linux   18m
weave-net    3         3         3       3            3           <none>                   6m10s

Quick Test:

istacey@k8s-master01:~$ kubectl run nginx --image=nginx
pod/nginx created
istacey@k8s-master01:~$ kubectl get pods -o wide
NAME    READY   STATUS              RESTARTS   AGE   IP       NODE           NOMINATED NODE   READINESS GATES
nginx   0/1     ContainerCreating   0          23s   <none>   k8s-worker01   <none>           <none>
istacey@k8s-master01:~$ kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          27s   10.44.0.1   k8s-worker01   <none>           <none>

istacey@k8s-master01:~$ kubectl delete po nginx
pod "nginx" deleted

Part 1 here:

AWS Free Tier Expired

Migrating to AWS Lightsail

As per the title, my 12-month free tier usage for this AWS account has expired. As this website would cost me around $60 per month, I decided to migrate to AWS Lightsail, with prices starting at a more attractive $3.50 per month. Thankfully I had billing alerts configured to warn me of the increased spend.

The migration was simple enough following the online instructions: export the current site, deploy a new WordPress Bitnami image on Amazon Lightsail, and then import the export into it.

I had some configuration to do in terms of the site, enabling HTTPS, assigning a free static IP and updating Route 53.

Once I was happy with the migration, I took final backups from the old architecture including a final snapshot of the RDS database instance before deleting the resources.

Extend Ceph Storage for Kubernetes Cluster

Scenario:

Four worker nodes, each with a 25GB raw disk, are used in a Ceph block cluster. As we are running low on space, we will extend the raw disks to 50GB and update rook-ceph accordingly.

Ceph OSD Management

Ceph Object Storage Daemons (OSDs) are the heart and soul of the Ceph storage platform. Each OSD manages a local device and together they provide the distributed storage. Rook will automate creation and management of OSDs to hide the complexity based on the desired state in the CephCluster CR as much as possible. This guide will walk through some of the scenarios to configure OSDs where more configuration may be required.

OSD Health

The rook-ceph-tools pod provides a simple environment to run Ceph tools. The ceph commands mentioned in this document should be run from the toolbox.

Once the toolbox pod is created, connect to the pod to execute the ceph commands to analyze the health of the cluster, in particular the OSDs and placement groups (PGs). Some common commands to analyze OSDs include:

ceph status
ceph osd tree
ceph osd status
ceph osd df
ceph osd utilization

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash

Status Before:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 13h)
    mgr: a(active, since 13h)
    osd: 4 osds: 4 up (since 13h), 4 in (since 13h)

  data:
    pools:   2 pools, 129 pgs
    objects: 5.12k objects, 19 GiB
    usage:   63 GiB used, 37 GiB / 100 GiB avail
    pgs:     129 active+clean

  io:
    client:   60 KiB/s wr, 0 op/s rd, 1 op/s wr
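
The raw capacity figures follow from the disk sizes: four 25 GiB disks give 100 GiB, and four 50 GiB disks will give 200 GiB. A small sketch of the arithmetic follows, assuming the 63 GiB of usage stays the same after the extension; the helper name usage_percent is made up for illustration.

```shell
# usage_percent USED_GIB OSD_COUNT DISK_GIB: integer percentage of raw
# capacity in use (rounded down). Hypothetical helper for illustration.
usage_percent() {
    local used=$1 osds=$2 disk=$3
    echo $(( used * 100 / (osds * disk) ))
}

usage_percent 63 4 25   # 63, current usage with 25 GiB disks
usage_percent 63 4 50   # 31, after the extension (31.5 rounded down)
```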


[istacey@master001 ~]$ kubectl --kubeconfig=/home/istacey/.kube/config-hr -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r -- ceph osd  status
ID  HOST        USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  worker002  10.9G  14.0G      1      151k      0        0   exists,up
 1  worker003  10.9G  14.0G      0        0       0        0   exists,up
 2  worker004  11.1G  13.8G      0        0       0        0   exists,up
 3  worker001  9.98G  15.0G      0        0       0        0   exists,up


openet@worker001:~$ lsblk | grep sdb -A1
sdb                                                                                                     8:16   0   25G  0 disk
└─ceph--f067bb6e--522a--48c6--a2a8--8930d15dc02f-osd--block--dc871464--0a16--484a--8fa8--b723eec178f1 253:10   0   25G  0 lvm

Raw Disk Extended:

openet@worker001:~$ lsblk | grep sdb -A2 
sdb                                                             8:16   0   50G  0 disk
└─ceph--f067bb6e--522a--48c6--a2a8--8930d15dc02f-osd--block--dc871464--0a16--484a--8fa8--b723eec178f1
                                                              253:10   0   25G  0 lvm

Remove the OSDs (one at a time):

https://github.com/rook/rook/blob/master/Documentation/ceph-osd-mgmt.md#remove-an-osd

To remove an OSD due to a failed disk or other re-configuration, consider the following to ensure the health of the data through the removal process:

  • Confirm you will have enough space on your cluster after removing your OSDs to properly handle the deletion
  • Confirm the remaining OSDs and their placement groups (PGs) are healthy in order to handle the rebalancing of the data
  • Do not remove too many OSDs at once
  • Wait for rebalancing between removing multiple OSDs

If all the PGs are active+clean and there are no warnings about being low on space, this means the data is fully replicated and it is safe to proceed. If an OSD is failing, the PGs will not be perfectly clean and you will need to proceed anyway.
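
The space check in the first bullet can be approximated with simple arithmetic: the raw data currently stored must fit on the remaining OSDs with some headroom. The sketch below is a hypothetical helper (fits_after_removal is not a Ceph command) and assumes the default nearfull ratio of 0.85; the ratios in effect on a cluster are listed by ceph osd dump.

```shell
# Usage: fits_after_removal USED_GIB TOTAL_GIB OSD_GIB [RATIO]
# Succeeds if USED fits within RATIO of the capacity left after removing one OSD.
# RATIO defaults to 0.85, assumed here to match the cluster's nearfull ratio.
fits_after_removal() {
  awk -v used="$1" -v total="$2" -v osd="$3" -v ratio="${4:-0.85}" \
    'BEGIN { exit !(used <= (total - osd) * ratio) }'
}
```

With the figures from the status output above (63 GiB used, 100 GiB total, one 25 GiB OSD), fits_after_removal 63 100 25 succeeds, but only just: 63 GiB against a limit of 63.75 GiB.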

Scale down rook-ceph-operator and the OSD deployments:

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep opera
rook-ceph-operator                   1/1     1            1           77d

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
deployment.apps/rook-ceph-operator scaled

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep opera
rook-ceph-operator                   0/0     0            0           77d

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status | egrep 'health|osds|usage'
    health: HEALTH_OK
    osd: 4 osds: 4 up (since 13h), 4 in (since 13h)
    usage:   63 GiB used, 37 GiB / 100 GiB avail

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep osd
rook-ceph-osd-0                      1/1     1            1           38h
rook-ceph-osd-1                      1/1     1            1           77d
rook-ceph-osd-2                      1/1     1            1           77d
rook-ceph-osd-3                      1/1     1            1           77d

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=0
deployment.apps/rook-ceph-osd-0 scaled

[istacey@master001 ~]$ kubectl get deployment -n rook-ceph | grep osd
rook-ceph-osd-0                      0/0     0            0           38h
rook-ceph-osd-1                      1/1     1            1           77d
rook-ceph-osd-2                      1/1     1            1           77d
rook-ceph-osd-3                      1/1     1            1           77d

Down and out the OSD

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd down osd.0
osd.0 is already down.

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status | egrep 'health|osds|usage'
    health: HEALTH_WARN
            1 osds down
            1 host (1 osds) down
    osd: 4 osds: 3 up (since 101s), 4 in (since 13h); 1 remapped pgs
    usage:   63 GiB used, 37 GiB / 100 GiB avail

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.09760  root default
-9         0.02440      host worker001
 3    hdd  0.02440          osd.3           up   1.00000  1.00000
-3         0.02440      host worker002
 0    hdd  0.02440          osd.0         down   1.00000  1.00000
-5         0.02440      host worker003
 1    hdd  0.02440          osd.1           up   1.00000  1.00000
-7         0.02440      host worker004
 2    hdd  0.02440          osd.2           up   1.00000  1.00000

### Mark the OSD as out:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd out osd.0
marked out osd.0.

Wait for the data to finish backfilling to other OSDs.

ceph status will indicate the backfilling is done when all of the PGs are active+clean.

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_WARN
            Degraded data redundancy: 3171/15372 objects degraded (20.628%), 80 pgs degraded, 80 pgs undersized

  services:
    mon: 3 daemons, quorum a,b,c (age 13h)
    mgr: a(active, since 13h)
    osd: 4 osds: 3 up (since 4m), 3 in (since 96s); 80 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 5.12k objects, 19 GiB
    usage:   50 GiB used, 25 GiB / 75 GiB avail
    pgs:     3171/15372 objects degraded (20.628%)
             78 active+undersized+degraded+remapped+backfill_wait
             49 active+clean
             2  active+undersized+degraded+remapped+backfilling

  io:
    client:   71 KiB/s wr, 0 op/s rd, 1 op/s wr
    recovery: 6.5 MiB/s, 1 objects/s

### backfilling is done:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status | egrep 'health|osds|usage'
    health: HEALTH_OK
    osd: 4 osds: 3 up (since 22m), 3 in (since 19m)
    usage:   62 GiB used, 13 GiB / 75 GiB avail
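
Waiting for the backfill can be scripted by polling ceph status until every PG reports active+clean. This is a rough sketch, under the assumption that the plain-text layout of ceph status matches the output shown above. all_pgs_clean and wait_for_clean are hypothetical helper names, and PGs in states such as active+clean+scrubbing are treated as not yet clean, which is a simplification.

```shell
# Read `ceph status` text on stdin; succeed only if every PG is active+clean.
all_pgs_clean() {
  awk '
    /pools:/ && /pgs/             { total = $(NF-1) }  # "pools:   2 pools, 129 pgs"
    /active\+clean[[:space:]]*$/  { clean = $(NF-1) }  # "129 active+clean"
    END { exit !(total > 0 && clean == total) }
  '
}

# Usage: wait_for_clean SECONDS COMMAND...
# COMMAND must print `ceph status` output; it is re-run until all PGs are clean.
wait_for_clean() {
  local interval=$1
  shift
  until "$@" | all_pgs_clean; do
    sleep "$interval"
  done
}
```

For example: wait_for_clean 30 kubectl -n rook-ceph exec rook-ceph-tools-5d9d5db5bc-npz4r -- ceph status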

Remove the OSD from the Ceph cluster

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd purge osd.0 --yes-i-really-mean-it
purged osd.0

### Note osd.0 is on worker002:

[istacey@master001 ~]$ kubectl get pods -n rook-ceph -o wide | grep osd | grep -v prepare
rook-ceph-osd-1-6c468554f4-8btvj                      1/1     Running     3          26h   10.42.171.207    worker003   <none>           <none>
rook-ceph-osd-2-5f8ffcd5bb-p44d4                      1/1     Running     1          25h   10.42.64.205     worker004   <none>           <none>
rook-ceph-osd-3-5d8b989cb-4hf8h                       1/1     Running     5          27h   10.42.7.26       worker001   <none>           <none>

Zap the disk

https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md#zapping-devices

As root, clean and prepare the disk on the VM:

DISK="/dev/sdb"
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
partprobe $DISK
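
Because these commands destroy whatever is on $DISK, a guard that runs first can catch a mistyped device name. A minimal sketch, assuming lsblk from util-linux is available; safe_to_zap is a hypothetical helper and is not part of the Rook instructions.

```shell
# Succeed only if the argument is a block device with nothing mounted from it.
safe_to_zap() {
  local disk=$1
  if [ ! -b "$disk" ]; then
    echo "$disk is not a block device" >&2
    return 1
  fi
  # lsblk prints one MOUNTPOINT line per device and partition; all empty means unmounted.
  if lsblk -nro MOUNTPOINT "$disk" | grep -q .; then
    echo "$disk or one of its partitions is mounted" >&2
    return 1
  fi
  return 0
}
```

It could be used as safe_to_zap "$DISK" || exit 1 before the dd line.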


[root@worker002 ~]# lsblk

NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb                   8:16   0   50G  0 disk

Scale back up and let osd rejoin:

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
deployment.apps/rook-ceph-operator scaled

[istacey@master001 ~]$ kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=1
deployment.apps/rook-ceph-osd-0 scaled

[istacey@master001 ~]$ kubectl -n rook-ceph get deployment | egrep 'rook-ceph-operator|rook-ceph-osd'
rook-ceph-operator                   1/1     1            1           77d
rook-ceph-osd-0                      1/1     1            1           39h
rook-ceph-osd-1                      1/1     1            1           77d
rook-ceph-osd-2                      1/1     1            1           77d
rook-ceph-osd-3                      1/1     1            1           77d

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.12199  root default
-9         0.02440      host worker001
 3    hdd  0.02440          osd.3           up   1.00000  1.00000
-3         0.04880      host worker002
 0    hdd  0.04880          osd.0           up   1.00000  1.00000
-5         0.02440      host worker003
 1    hdd  0.02440          osd.1           up   1.00000  1.00000
-7         0.02440      host worker004
 2    hdd  0.02440          osd.2           up   1.00000  1.00000

openet@worker002:~$ lsblk | grep sdb -A1
sdb                                                                                                     8:16   0   50G  0 disk
└─ceph--ea8115b7--5418--41b9--b4d3--d6e22526dbb1-osd--block--68cfcb49--f858--46f2--979f--dc266e4e6cf0 253:10   0   50G  0 lvm

Wait for the rebalance to finish.

Rebalancing done:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 14h)
    mgr: a(active, since 14h)
    osd: 4 osds: 4 up (since 32m), 4 in (since 32m)

  task status:

  data:
    pools:   2 pools, 129 pgs
    objects: 5.12k objects, 19 GiB
    usage:   63 GiB used, 62 GiB / 125 GiB avail
    pgs:     129 active+clean

  io:
    client:   73 KiB/s wr, 0 op/s rd, 2 op/s wr

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r -- ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL   USED    RAW USED  %RAW USED
hdd    125 GiB  62 GiB  59 GiB    63 GiB      50.14
TOTAL  125 GiB  62 GiB  59 GiB    63 GiB      50.14

--- POOLS ---
POOL                   ID  PGS  STORED  OBJECTS  USED    %USED  MAX AVAIL
device_health_metrics   1    1     0 B        0     0 B      0     15 GiB
replicapool             3  128  19 GiB    5.12k  58 GiB  55.89     15 GiB

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd status
ID  HOST        USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  worker002  20.5G  29.4G      1      156k      0        0   exists,up
 1  worker003  13.2G  11.7G      0        0       0        0   exists,up
 2  worker004  14.3G  10.6G      0        0       0        0   exists,up
 3  worker001  14.5G  10.4G      0     4095       0        0   exists,up

Repeat for next 3 OSDs…

The operator ideally will automatically create the new OSD within a few minutes of adding the new device or updating the CR. If you don’t see a new OSD automatically created, restart the operator (by deleting the operator pod) to trigger the OSD creation.

Extra step after hitting an issue:

A pod was in an error state and storage was not available on the node. After the scaling operations, I edited the CephCluster resource with kubectl.

### Edit with kubectl and remove node:

kubectl edit CephCluster rook-ceph -n rook-ceph 

    - deviceFilter: sdb
      name: worker001
      resources: {}

End result:

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph status
  cluster:
    id:     13c5138f-f2f6-46ea-8ee0-4966330ac081
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 3h)
    mgr: a(active, since 22h)
    osd: 4 osds: 4 up (since 94m), 4 in (since 94m)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 5.12k objects, 19 GiB
    usage:   63 GiB used, 137 GiB / 200 GiB avail
    pgs:     33 active+clean
 
  io:
    client:   49 KiB/s wr, 0 op/s rd, 1 op/s wr
 
[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd status
ID  HOST        USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  worker002  12.7G  37.2G      0        0       0        0   exists,up
 1  worker003  16.2G  33.7G      1     24.7k      0        0   exists,up
 2  worker004  15.2G  34.7G      0        0       0        0   exists,up
 3  worker001  18.6G  31.3G      0      819       0        0   exists,up

[istacey@master001 ~]$ kubectl -n rook-ceph exec -it rook-ceph-tools-5d9d5db5bc-npz4r  -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.09760  root default
-9         0.02440      host worker001
 3    hdd  0.02440          osd.3           up   1.00000  1.00000
-3         0.02440      host worker002
 0    hdd  0.02440          osd.0           up   1.00000  1.00000
-5         0.02440      host worker003
 1    hdd  0.02440          osd.1           up   1.00000  1.00000
-7         0.02440      host worker004
 2    hdd  0.02440          osd.2           up   1.00000  1.00000

References:

https://github.com/rook/rook/blob/master/Documentation/ceph-osd-mgmt.md#remove-an-osd

https://github.com/rook/rook/issues/2997

https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/

https://www.cloudops.com/blog/the-ultimate-rook-and-ceph-survival-guide/

Categories
Linux

Basic HA NFS Server with Keepalived

Aim

Create a simple NFS HA cluster on RHEL 7 VMs with local storage as shown below. The VMs run as guests on a RHEL 8 server running KVM. Connections will be made from the local network, and pods running in a Kubernetes cluster will mount the share as PersistentVolumes.

I will also create an SFTP chroot jail for incoming client sftp connections.

Logical diagram

Prerequisites

Install and enable the nfs, keepalived and rsync packages:

sudo yum install -y nfs-utils keepalived rsync

sudo systemctl enable nfs-server
sudo systemctl enable keepalived

keepalived --version

Get IP info:

[istacey@nfs-server01 ~]$ ip --brief a s
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.12.6.111/25 fe80::5054:ff:fe79:79b3/64
eth1             UP             192.168.112.111/24 fe80::5054:ff:fe06:8dc5/64
eth2             UP             10.12.8.103/28 fe80::5054:ff:fec6:428f/64

[istacey@nfs-server02 ~]$ ip --brief a s
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.12.6.112/25 fe80::5054:ff:fef5:765e/64
eth1             UP             192.168.112.112/24 fe80::5054:ff:fead:fa64/64
eth2             UP             10.12.8.104/28 fe80::5054:ff:fef5:13de/64

VIP DETAILS:
VIP – NFS nfsvip 10.12.8.102
NFS_01 nfs-server01 10.12.8.103
NFS_02 nfs-server02 10.12.8.104

Configure keepalived

Server 1:

[istacey@nfs-server01 ~]$ cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

vrrp_instance VI_1 {
    state MASTER
    interface eth2
    virtual_router_id 51
    priority 255
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.12.8.102
    }
}

Server 2:

[istacey@nfs-server02 ~]$ cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

vrrp_instance VI_1 {
    state BACKUP
    interface eth2
    virtual_router_id 51
    priority 254
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.12.8.102
    }
}

Test with ping. Before keepalived is started, the VIP does not respond:


[istacey@nfs-server02 ~]$ ping 10.12.8.102
PING 10.12.8.102 (10.12.8.102) 56(84) bytes of data.
From 10.12.8.104 icmp_seq=1 Destination Host Unreachable
From 10.12.8.104 icmp_seq=2 Destination Host Unreachable
From 10.12.8.104 icmp_seq=3 Destination Host Unreachable
From 10.12.8.104 icmp_seq=4 Destination Host Unreachable
^C
--- 10.12.8.102 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms
pipe 4
[istacey@nfs-server02 ~]$


[istacey@nfs-server01 ~]$ sudo systemctl start  keepalived
[istacey@nfs-server02 ~]$ sudo systemctl start  keepalived

[istacey@nfs-server02 ~]$ ping 10.12.8.102
PING 10.12.8.102 (10.12.8.102) 56(84) bytes of data.
64 bytes from 10.12.8.102: icmp_seq=1 ttl=64 time=0.123 ms
64 bytes from 10.12.8.102: icmp_seq=2 ttl=64 time=0.116 ms
64 bytes from 10.12.8.102: icmp_seq=3 ttl=64 time=0.104 ms

Show VIP:

[istacey@nfs-server01 ~]$ ip --brief a s
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.12.6.111/25 fe80::5054:ff:fe79:79b3/64
eth1             UP             192.168.112.111/24 fe80::5054:ff:fe06:8dc5/64
eth2             UP             10.12.8.103/28 10.12.8.102/32 fe80::5054:ff:fec6:428f/64
[istacey@nfs-server01 ~]$
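
To see which node currently holds the VIP without reading the output by eye, the same ip --brief output can be filtered. A small sketch, assuming the column layout shown above (interface, state, then addresses); holds_vip is a hypothetical helper name.

```shell
# Read `ip --brief a s` output on stdin; succeed if any interface carries the given IP.
holds_vip() {
  awk -v vip="$1" '
    {
      # Addresses start at column 3 and look like 10.12.8.102/32.
      for (i = 3; i <= NF; i++) {
        split($i, a, "/")
        if (a[1] == vip) found = 1
      }
    }
    END { exit !found }
  '
}
```

For example, run on each server: ip --brief a s | holds_vip 10.12.8.102 && echo "VIP is here"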

Create SFTP chroot jail

Create the sftpusers group and the user on both servers:

[istacey@nfs-server01 ~]$ sudo groupadd -g 15000 nfsrsync
[istacey@nfs-server01 ~]$ sudo groupadd -g 15001 vmnfs1
[istacey@nfs-server01 ~]$ sudo groupadd -g 15002 sftpusers

[istacey@nfs-server01 ~]$ sudo useradd -u 15000 -g nfsrsync nfsrsync
[istacey@nfs-server01 ~]$ sudo useradd -u 15001 -g vmnfs1 vmnfs1

[istacey@nfs-server01 ~]$ sudo usermod -aG sftpusers,nfsrsync vmnfs1

[istacey@nfs-server01 ~]$ sudo mkdir /NFS/vmnfs1
[istacey@nfs-server01 ~]$ sudo mkdir /NFS/vmnfs1/home
[istacey@nfs-server01 ~]$ sudo mkdir /NFS/vmnfs1/home/voucher-management
[istacey@nfs-server01 ~]$ sudo chown vmnfs1:sftpusers /NFS/vmnfs1/home

Note: change the ownership of the user’s chrooted “home” directory only. It’s important to leave everything else with the default root ownership and permissions.

[istacey@nfs-server01 ~]$ find /NFS -type d -exec ls -ld {} \;
drwxr-xr-x. 3 root root 20 Jul 20 15:52 /NFS
drwxr-xr-x 3 root root 18 Jul 20 15:00 /NFS/vmnfs1
drwxr-xr-x 3 vmnfs1 sftpusers 52 Jul 20 15:16 /NFS/vmnfs1/home
drwxrwxrwx 2 vmnfs1 nfsrsync 59 Jul 20 15:31 /NFS/vmnfs1/home/voucher-management

Update ssh and restart the service:

[istacey@nfs-server01 ~]$ sudo vi /etc/ssh/sshd_config

[istacey@nfs-server01 ~]$ sudo cat  /etc/ssh/sshd_config | grep Subsys -A3
#Subsystem      sftp    /usr/libexec/openssh/sftp-server
Subsystem   sftp    internal-sftp -d /home
Match Group sftpusers
ChrootDirectory /NFS/%u
ForceCommand internal-sftp -d /home/voucher-management

[istacey@nfs-server01 ~]$ sudo  systemctl restart sshd

Note: the ForceCommand option drops the sftp user into a subdirectory
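
A syntax error in sshd_config can lock you out once the service restarts, so it may be worth validating the file first. sshd -t is the OpenSSH test mode, which checks the configuration and exits. The wrapper name below is hypothetical.

```shell
# Restart sshd only if the configuration passes the built-in syntax check.
restart_sshd_if_valid() {
  if ! sudo sshd -t; then
    echo "sshd_config has errors, not restarting" >&2
    return 1
  fi
  sudo systemctl restart sshd
}
```

Keeping an existing ssh session open while testing gives a way back in if the new settings do not behave as expected.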

To test, first check ssh; this should throw an error:

[istacey@nfs-server02 ~]$ ssh vmnfs1@nfs-server01
vmnfs1@nfs-server01's password:
Last login: Tue Jul 20 15:13:33 2021 from nfs-server02-om.ocs.a1.hr
/bin/bash: No such file or directory
Connection to nfs-server01 closed.
[istacey@nfs-server02 ~]$

OR: 

[istacey@nfs-server02 ~]$ ssh vmnfs1@nfs-server01
vmnfs1@nfs-server01's password:
This service allows sftp connections only.
Connection to nfs-server01 closed.
[istacey@nfs-server02 ~]$

The user can no longer connect via ssh. Let’s try sftp:

[istacey@nfs-server02 ~]$ sftp  vmnfs1@nfs-server01
vmnfs1@nfs-server01's password:
Connected to nfs-server01.
sftp> pwd
Remote working directory: /home/voucher-management
sftp> ls
testfile       testfile1      testfiledate
sftp> quit
[istacey@nfs-server02 ~]$

As required, the user is dropped into /home/voucher-management (/NFS/vmnfs1/home/voucher-management/ on the server).

Finally, make sure a regular user can still log in via ssh without the chroot restrictions. That completes this part: the sftp server is configured with a jailed chroot user.

Configure rsync

As we are only using local storage and not shared storage, we will synchronize the folders with rsync.

On both servers I created a user account called nfsrsync, verified folder ownership and permissions, and generated and copied ssh keys.

[nfsrsync@nfs-server01 ~]$ ssh-keygen -t rsa
[nfsrsync@nfs-server01 .ssh]$ cp id_rsa.pub authorized_keys

[nfsrsync@nfs-server01 ~]$ ssh-copy-id nfs-server02
[nfsrsync@nfs-server01 .ssh]$ scp id_rsa* nfs-server02:~/.ssh/

Add a cron job on each server to run rsync in both directions with a push. I chose not to run rsync as a daemon for this solution.

[nfsrsync@nfs-server01 ~]$ crontab -l
*/5 * * * * rsync -rt /NFS/vmnfs1/home/voucher-management/ nfsrsync@nfs-server02:/NFS/vmnfs1/home/voucher-management/

[nfsrsync@nfs-server02 ~]$ crontab -l
*/5 * * * * rsync -rt /NFS/vmnfs1/home/voucher-management/ nfsrsync@nfs-server01:/NFS/vmnfs1/home/voucher-management/
[nfsrsync@nfs-server02 ~]$
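
With a five-minute schedule, a large transfer could still be running when the next job starts. One way to guard against overlapping runs is a lock directory, since mkdir either creates it or fails atomically. This is a sketch rather than what is deployed here: sync_push and the lock path are hypothetical, and the rsync flags and paths are copied from the crontab above.

```shell
# Push the shared folder to the peer unless a previous run still holds the lock.
# LOCKDIR is a hypothetical override; the default lock path is an assumption.
sync_push() {
  local peer=$1
  local dir=/NFS/vmnfs1/home/voucher-management/
  local lock=${LOCKDIR:-/tmp/nfsrsync.lock}
  if ! mkdir "$lock" 2>/dev/null; then
    echo "previous sync still running, skipping" >&2
    return 0
  fi
  rsync -rt "$dir" "nfsrsync@${peer}:${dir}"
  local rc=$?
  rmdir "$lock"
  return "$rc"
}
```

If a run is killed before it reaches rmdir, the lock directory has to be removed by hand. Note also that without --delete, a file removed on one server is copied back from the other on the next run.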

Configure NFS

On both servers:

[istacey@nfs-server01 ~]$ sudo vi /etc/exports
[istacey@nfs-server01 ~]$ cat /etc/exports
/NFS/vmnfs1/home/voucher-management     *(rw,no_root_squash)
[istacey@nfs-server01 ~]$ sudo systemctl start nfs-server

Verify with showmount and test mounting the share, from server 2:

[istacey@nfs-server02 ~]$ sudo mount nfs-server01:/NFS/vmnfs1/home/voucher-management  /mnt
[istacey@nfs-server02 ~]$ df -h /mnt
Filesystem                                          Size  Used Avail Use% Mounted on
nfs-server01:/NFS/vmnfs1/home/voucher-management  100G   33M  100G   1% /mnt
[istacey@nfs-server02 ~]$ mount | grep nfs4
nfs-server01:/NFS/vmnfs1/home/voucher-management on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.12.6.112,local_lock=none,addr=10.12.6.111)
[istacey@nfs-server02 ~]$

[istacey@nfs-server02 ~]$ find /mnt
/mnt
/mnt/testfile
/mnt/testfile1
/mnt/testfiledate
[istacey@nfs-server02 ~]$
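
The showmount check mentioned above is not shown in the transcript. A minimal sketch of it follows, assuming the usual showmount -e layout of one export path per line followed by its client list; is_exported is a hypothetical helper.

```shell
# Read `showmount -e HOST` output on stdin; succeed if the given path is exported.
is_exported() {
  awk -v path="$1" '$1 == path { found = 1 } END { exit !found }'
}
```

For example: showmount -e nfs-server01 | is_exported /NFS/vmnfs1/home/voucher-management && echo "export is visible"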

And we are done.

References

Keepalived: https://www.redhat.com/sysadmin/keepalived-basics

rsync: https://www.atlantic.net/vps-hosting/how-to-use-rsync-copy-sync-files-servers/

chroot jail: https://access.redhat.com/solutions/2399571 , ForceCommand: https://serverfault.com/questions/704869/forward-sftp-user-to-chroot-subdirectory-after-authentication

Categories
Cloud

Updating my WordPress Environment on Amazon EC2

Decoupling my WordPress Architecture:

In a previous post I described the creation of this site in AWS. Now is the time to decouple the infrastructure and remove the reliance on the previously created EC2 instance.

By default WordPress stores data in two different ways, often locally on the same VM/instance.

  • MySQL database: articles, comments, users and parts of the configuration are stored in a MySQL database. I’m already using an RDS managed MySQL database here to avail of the benefits that brings.
  • File system: uploaded media files are stored on the file system, in my case under /var/www/html/wp-content. This means if the EC2 instance is terminated that data is lost.

The Aim:

  • Create an ephemeral, stateless instance, outsourcing content to S3 and EFS.
  • Create an Auto Scaling group with a launch template to scale in/out a fleet of instances. In my case, to stick to the free tier, I will set a maximum capacity of 1.

The new infrastructure

EFS: Elastic File System:

Amazon EFS provides scalable file storage for use with Amazon EC2 and I will use it for my wp-content folder. I also have some images stored in S3. Like S3, EFS has resiliency across Availability Zones. With Amazon EFS you pay for the resources that you use, but my footprint is very low and I do not expect any charges over a few cents.

To use EFS I:

  • Created a new EFS file system and mounted it temporarily at /efs/wp-content
  • Copied the contents of /var/www/html/wp-content to the temporary mount
  • Unmounted the EFS file system and remounted it at /var/www/html/wp-content, making the mount persistent by updating /etc/fstab
  • Checked the website and the WordPress Update functionality
My EFS File System
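
For the /etc/fstab step, an NFSv4.1 entry for EFS generally has the shape generated below. This is a sketch with placeholder values: the file system ID and region in the example are hypothetical, efs_fstab_line is not an AWS tool, and the mount options are the ones AWS recommends for mounting EFS over NFS.

```shell
# Print an /etc/fstab line for mounting an EFS file system over NFSv4.1.
# Usage: efs_fstab_line FILE_SYSTEM_ID REGION MOUNTPOINT
efs_fstab_line() {
  local fs_id=$1 region=$2 mountpoint=$3
  local opts="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,_netdev"
  printf '%s.efs.%s.amazonaws.com:/ %s nfs4 %s 0 0\n' \
    "$fs_id" "$region" "$mountpoint" "$opts"
}
```

For example, efs_fstab_line fs-12345678 eu-west-1 /var/www/html/wp-content prints a line that can be appended to /etc/fstab and tested with sudo mount -a. The _netdev option delays the mount until the network is up.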

Auto Scaling:

An Auto Scaling group contains a collection of Amazon EC2 instances that are treated as a logical grouping for the purposes of automatic scaling and management. An Auto Scaling group also enables you to use Amazon EC2 Auto Scaling features such as health check replacements and scaling policies.

Here I:

  • Created a new AMI from my original EC2 instance
  • Created a Launch Template (LT) containing the configuration information to launch new instances, including passing specific launch commands in the user data section.
  • Tested new instances by updating the ELB targets
  • After successfully testing, I terminated my previous instances, created a new ASG and updated the ELB targets.
ASG
ASG Settings

Wrap Up:

My environment is much more resilient now, with no dependency on a single EC2 instance. High availability has been introduced at all levels, although to keep to the free tier my RDS DB instance is not Multi-AZ. Next I’ll tear everything down and redeploy with CloudFormation.

Categories
Cloud Linux

Kubernetes Home Lab (part 1)

Infrastructure:

Ideally I wanted to run a home Kubernetes cluster on three or more Raspberry Pis, but at the time of writing I only have one suitable Pi 4 at home and stock appears to be in short supply. Instead I will use what I have, mixing and matching devices.

  • One HP Z200 Workstation with 8GB RAM, running Ubuntu 20.04 with KVM, hosting 2 Ubuntu VMs that I’ll designate as worker nodes in the cluster.
  • One Raspberry Pi 4 Model B with 2GB RAM, running Ubuntu 20.04, that I’ll use as the Kubernetes Master / Control Plane node.
My makeshift home lab with Stormtrooper on patrol!

Install and Prepare Ubuntu 20.04 on the Z200 / Configure the KVM Hypervisor:

Install Ubuntu on the Z200 Workstation via a bootable USB stick.

Install cpu-checker and verify that the system can use KVM acceleration.

sudo apt install cpu-checker
sudo kvm-ok
The workstation to be used as my hypervisor
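
KVM acceleration needs the CPU to advertise hardware virtualization: the vmx flag on Intel or svm on AMD. The same thing can be checked directly from /proc/cpuinfo. A small sketch; has_virt_flags is a hypothetical name.

```shell
# Read /proc/cpuinfo-style text on stdin; succeed if vmx (Intel) or svm (AMD) is listed.
has_virt_flags() {
  grep -Eqw 'vmx|svm'
}
```

For example: has_virt_flags < /proc/cpuinfo && echo "hardware virtualization available". The flag can be present and KVM still unusable if virtualization is disabled in the BIOS, which is the case kvm-ok reports on.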

Install KVM Packages:

sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virt-manager virtinst

sudo systemctl status libvirtd

sudo systemctl enable --now libvirtd

Authorize User and Verify the install:

sudo usermod -aG libvirt $USER
sudo usermod -aG kvm $USER

sudo virsh list --all

Configure Bridged Networking:

Bridged networking allows the virtual interfaces to connect to the outside network through the physical interface, making them appear as normal hosts to the rest of the network. https://help.ubuntu.com/community/KVM/Networking#Bridged_Networking

ip --brief a s
brctl show
nmcli con show
sudo nmtui
NetworkManager TUI

Verify with:

ip --brief a s
brctl show
nmcli con show

Configure Private Virtual Switch:

Use virsh to create the private network:

istacey@ubuntu-z200-01:~$ vi /tmp/br0.xml
istacey@ubuntu-z200-01:~$ cat /tmp/br0.xml
<network> 
  <name>br0</name> 
  <forward mode="bridge"/> 
  <bridge name="br0" /> 
</network>
istacey@ubuntu-z200-01:~$ sudo virsh net-list --all
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes

istacey@ubuntu-z200-01:~$ sudo virsh net-define /tmp/br0.xml 
Network br0 defined from /tmp/br0.xml

istacey@ubuntu-z200-01:~$ sudo virsh net-start br0
Network br0 started

istacey@ubuntu-z200-01:~$ sudo virsh net-autostart br0
Network br0 marked as autostarted

istacey@ubuntu-z200-01:~$ sudo virsh net-list --all
 Name      State    Autostart   Persistent
--------------------------------------------
 br0       active   yes         yes
 default   active   yes         yes

istacey@ubuntu-z200-01:~$

Enable incoming ssh:

sudo apt update 
sudo apt install openssh-server
sudo systemctl status ssh

Test KVM

To test KVM, I created a temporary VM via the Virtual Machine Manager GUI (virt-manager), connected to the br0 bridge and used ssh to connect.

Install Vagrant:

KVM is all that is required to create VMs, either manually through the virt-manager GUI or scripted via virt-install, ansible or other automation tool, but for this exercise I thought I’d try Vagrant. I plan to build and rebuild this lab frequently and Vagrant is a popular tool for quickly spinning up VMs. It is not something I’d previously played with, so I thought I’d check it out.

Download and install

Installed as per https://www.vagrantup.com/downloads.

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -

sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"

sudo apt-get update && sudo apt-get install vagrant

vagrant --version

Enable Libvirt provider plugin

We need to install the libvirt provider plugin, as Vagrant is only aware of Hyper-V, Docker and Oracle VirtualBox by default, as shown below.

Default Vagrant Providers

However I hit the following bug when trying to install:

istacey@ubuntu-z200-01:~/vagrant$ vagrant plugin install vagrant-libvirt
Installing the 'vagrant-libvirt' plugin. This can take a few minutes...
Building native extensions. This could take a while...
Vagrant failed to properly resolve required dependencies. These
errors can commonly be caused by misconfigured plugin installations
or transient network issues. The reported error is:

ERROR: Failed to build gem native extension.

....

common.c:27:10: fatal error: st.h: No such file or directory
   27 | #include <st.h>
      |          ^~~~~~
compilation terminated.
make: *** [Makefile:245: common.o] Error 1

make failed, exit code 2

Gem files will remain installed in /home/istacey/.vagrant.d/gems/3.0.1/gems/ruby-libvirt-0.7.1 for inspection.
Results logged to /home/istacey/.vagrant.d/gems/3.0.1/extensions/x86_64-linux/3.0.0/ruby-libvirt-0.7.1/gem_make.out

The bug is described here: https://github.com/hashicorp/vagrant/issues/12445#issuecomment-876254254

After applying the suggested hotfix, I was able to install the plugin and test successfully:

vagrant-libvirt plugin
First Vagrant VM
Vagrant VM and manually provisioned VM running

Create the Worker Node VMs

With KVM working and Vagrant configured we can create the VMs that will become worker nodes in the K8s cluster. Below is my Vagrantfile to spin up two VMs; I referred to https://github.com/vagrant-libvirt/vagrant-libvirt for options:

Vagrant.configure('2') do |config|
  config.vm.box = "generic/ubuntu2004"
  
  config.vm.define :k8swrk01 do |k8swrk01|
    k8swrk01.vm.hostname = "k8s-worker01"
    k8swrk01.vm.network :private_network, type: "dhcp",
      libvirt__network_name: "br0"
    # The vagrant-libvirt provider is named :libvirt; memory is given in MB.
    k8swrk01.vm.provider :libvirt do |libvirt|
      libvirt.memory = 2048
      libvirt.cpus   = 2
    end
  end

  config.vm.define :k8swrk02 do |k8swrk02|
    k8swrk02.vm.hostname = "k8s-worker02"
    k8swrk02.vm.network :private_network, type: "dhcp",
      libvirt__network_name: "br0"
    k8swrk02.vm.provider :libvirt do |libvirt|
      libvirt.memory = 2048
      libvirt.cpus   = 2
    end
  end

end
Running vagrant up to start the two VMs
VMs running

Install Ubuntu on the Raspberry Pi

Following https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#2-prepare-the-sd-card

Configure nodes

Next configure the nodes, creating user accounts, copying ssh-keys, configuring sudoers, etc.

See part 2 for bootstrapping a new Kubernetes cluster:

Resources

Here are some articles I came across in my research or by complete accident:

Tool Homepages

Installation

https://leftasexercise.com/2020/05/15/managing-kvm-virtual-machines-part-i-vagrant-and-libvirt/

https://www.taniarascia.com/what-are-vagrant-and-virtualbox-and-how-do-i-use-them/

https://www.hebergementwebs.com/news/how-to-configure-a-kubernetes-cluster-on-ubuntu-20-04-18-04-16-04-in-14-steps

https://ostechnix.com/how-to-use-vagrant-with-libvirt-kvm-provider/

Categories
Cloud

The new AWS SysOps Certification exam (SOA-C02)

On March 26th of this year, I sat the new AWS Certified SysOps Administrator – Associate SOA-C02 Beta exam, the last day of the Beta period. After what felt like a very long, but not unexpected, wait for the results, I received an email from Credly yesterday informing me I have a new badge for AWS SysOps Administrator. Just to make sure, I logged into my Certmetrics and confirmed the pass! Happy days!

Previous AWS Exams

This was my third AWS certification after passing two exams last year, the foundational AWS Certified Cloud Practitioner followed by the AWS Certified Solutions Architect – Associate (SAA-C02).

Whilst the merits of the Cloud Practitioner certification are questionable, I felt it served as a good warm-up for two reasons:

  • A refresher for someone like me who hadn’t sat a professional exam for some time.
  • A first experience of an online proctored exam, given that test centers were closed due to COVID-19.

I was happy to pass and felt confident moving on to tackle the Solutions Architect cert. First, however, I successfully sat Microsoft Azure Fundamentals, for which I had a free exam voucher. The content of this exam shares some of the basic principles of cloud computing with Cloud Practitioner and, along with previous experience with Azure, this stood me in good stead.

I had started studying for the Architect exam back in January 2020 with Adrian Cantrill’s course on Linux Academy, but work got in the way, and by the time I got to focus on the Architect certification again AWS had released the new SAA-C02 version of the exam with new topics added. I opted to study Adrian’s new, comprehensive course at learn.cantrill.io to continue my studies. Thankfully I got my pass, and the studies have proved useful as the portion of my job working with AWS grows and more of our Telco customers adopt cloud services.

An example of the image used in Adrian Cantrill’s courses

SysOps Study

Anyway, back to the subject of this post, the SysOps Associate beta exam. I decided to give the beta a go after seeing posts regarding the new exam on various AWS channels, Reddit, Twitter and Adrian’s Tech Study Slack. Being a beta exam, the fee was at a reduced rate and the only downside I saw was having to wait around 90 days for the result, as opposed to the instant result on a non-beta exam!

To study this time I opted for Stephane Maarak’s excellent Udemy course. I didn’t quite have the time for Adrians Cantrill’s SysOps course, so I would rely on Stephane’s course, the overlap from Adrian’s Solution Architect course and my growing experience with AWS.

The exam

The exam is split into two parts: the first is the familiar multiple-choice and multi-answer questions section, and the second is a hands-on lab portion. With a total time of 3 hours and 45 minutes to complete all sections, plenty of time is provided and, being a beta, it allows for any issues experienced. On finishing the first portion of 55 questions, I felt OK but not super confident, as some subjects were not so familiar to me at the time.

Sticking to the NDA and without giving anything away, some of the subjects I was not sure about were:

  • Amazon Elasticsearch Architecture
  • CloudFormation cross-stack references vs nested stacks
  • Plus I got two questions on Route 53 around apex vs non-apex DNS record types, for which I know I chose the wrong option!

So going into the 3 labs, I felt I’d need to score well to secure a passing mark (720 out of 1000). I had plenty of time remaining for the labs, much longer than the 20 minutes recommended for each, so I took my time and didn’t rush through them, thoroughly checking my work after completing the tasks.

My experience of the labs was good. I believe other people sitting the exam had issues, but I had no such problems. The labs were set up through a virtual Windows machine with browser access to the AWS console (there is no internet access, so you are not able to look at AWS documentation to help implement tasks). On the right-hand side of the screen you are given a scenario and asked to create A, B, C or implement X, Y, Z, using the names provided, and to make sure it works as specified. You follow the requirements, implement the task using the console and complete each lab before moving on to the next (you cannot return to a previously completed lab).

Again sticking to the NDA my three tasks were around:

  • Setting up a scalable application: configuring a VPC, LT, ASG, ALB, security, networking etc.
  • Using AWS Config to ensure something is set (compliance)
  • Configuring some S3 data buckets with lifecycle policies

On completing each lab, I felt confident that I’d fully implemented the tasks correctly.

Wrapping Up

I enjoyed the exam and liked the hands-on section, and I expect that element will be added to future AWS exams. I didn’t like the long wait for the results, but I knew about that beforehand!

The old SOA-C01 SysOps exam is retired from 26th of July. If you are looking to take the new SOA-C02 exam you can find more information on the Coming Soon to AWS Certification page with the exam guide here.

I will look to complete the AWS Certified Developer next to round off all three associate certifications. First, however, I want to clear my Certified Kubernetes Administrator (CKA).

Categories
Linux

Install Docker CE on RHEL8

As per Red Hat documentation, Docker is not supported in RHEL 8:

The Podman, Skopeo, and Buildah tools were developed to replace Docker command features. Each tool in this scenario is more lightweight and focused on a subset of features.

For my latest work project, however, where we will be deploying Kubernetes clusters with Rancher, we need RHEL 8 and Docker.

Rancher Support Matrix

Manual Install

Following https://linuxconfig.org/how-to-install-docker-in-rhel-8

Add and enable the docker-ce repo with dnf config-manager. Verify with repolist:

$ sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo

[ec2-user@ip-192-169-2-20 ~]$ sudo dnf repolist -v  | grep docker-ce-stable -A10
repo: using cache for: docker-ce-stable
docker-ce-stable: using metadata from Wed 02 Jun 2021 07:27:37 PM UTC.
....

Repo-id            : docker-ce-stable
Repo-name          : Docker CE Stable - x86_64
Repo-revision      : 1622662057
Repo-updated       : Wed 02 Jun 2021 07:27:37 PM UTC
Repo-pkgs          : 38
Repo-available-pkgs: 38
Repo-size          : 937 M
Repo-baseurl       : https://download.docker.com/linux/centos/8/x86_64/stable
Repo-expire        : 172,800 second(s) (last: Tue 15 Jun 2021 03:49:08 PM UTC)
Repo-filename      : /etc/yum.repos.d/docker-ce.repo

Display available versions and install with dnf and the --nobest flag, which allows dnf to fall back to an older candidate if the newest one has dependency problems:

[ec2-user@ip-192-169-2-20 ~]$ sudo dnf list docker-ce --showduplicates | sort -r
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 0:40:11 ago on Tue 15 Jun 2021 03:49:08 PM UTC.
docker-ce.x86_64                3:20.10.7-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.6-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.5-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.4-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.3-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.2-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.1-3.el8                 docker-ce-stable
docker-ce.x86_64                3:20.10.0-3.el8                 docker-ce-stable
docker-ce.x86_64                3:19.03.15-3.el8                docker-ce-stable
docker-ce.x86_64                3:19.03.14-3.el8                docker-ce-stable
docker-ce.x86_64                3:19.03.13-3.el8                docker-ce-stable
Available Packages

[ec2-user@ip-192-169-2-20 ~]$ sudo dnf install --nobest docker-ce
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 0:40:17 ago on Tue 15 Jun 2021 03:49:08 PM UTC.
Dependencies resolved.
===============================================================================================================================================
 Package                            Architecture Version                                           Repository                             Size
===============================================================================================================================================
Installing:
 docker-ce                          x86_64       3:20.10.7-3.el8                                   docker-ce-stable                       27 M
Installing dependencies:
 container-selinux                  noarch       2:2.162.0-1.module+el8.4.0+11311+9da8acfb         rhui-rhel-8-appstream-rhui-rpms        52 k
 containerd.io                      x86_64       1.4.6-3.1.el8                                     docker-ce-stable                       34 M
 docker-ce-cli                      x86_64       1:20.10.7-3.el8                                   docker-ce-stable                       33 M
 docker-ce-rootless-extras          x86_64       20.10.7-3.el8                                     docker-ce-stable                      9.2 M
 docker-scan-plugin                 x86_64       0.8.0-3.el8                                       docker-ce-stable                      4.2 M
 fuse-common                        x86_64       3.2.1-12.el8                                      rhui-rhel-8-baseos-rhui-rpms           21 k
 fuse-overlayfs                     x86_64       1.4.0-3.module+el8.4.0+11311+9da8acfb             rhui-rhel-8-appstream-rhui-rpms        72 k
 fuse3                              x86_64       3.2.1-12.el8                                      rhui-rhel-8-baseos-rhui-rpms           50 k
 fuse3-libs                         x86_64       3.2.1-12.el8                                      rhui-rhel-8-baseos-rhui-rpms           94 k
 iptables                           x86_64       1.8.4-10.el8                                      rhui-rhel-8-baseos-rhui-rpms          581 k
 libcgroup                          x86_64       0.41-19.el8                                       rhui-rhel-8-baseos-rhui-rpms           70 k
 libnetfilter_conntrack             x86_64       1.0.6-5.el8                                       rhui-rhel-8-baseos-rhui-rpms           65 k
 libnfnetlink                       x86_64       1.0.1-13.el8                                      rhui-rhel-8-baseos-rhui-rpms           33 k
 libnftnl                           x86_64       1.1.5-4.el8                                       rhui-rhel-8-baseos-rhui-rpms           83 k
 libslirp                           x86_64       4.3.1-1.module+el8.4.0+11311+9da8acfb             rhui-rhel-8-appstream-rhui-rpms        69 k
 policycoreutils-python-utils       noarch       2.9-9.el8                                         rhui-rhel-8-baseos-rhui-rpms          251 k
 slirp4netns                        x86_64       1.1.8-1.module+el8.4.0+11311+9da8acfb             rhui-rhel-8-appstream-rhui-rpms        51 k
Enabling module streams:
 container-tools                                 rhel8

Transaction Summary
===============================================================================================================================================
Install  18 Packages

Total download size: 108 M
Installed size: 441 M
Is this ok [y/N]: y
Downloading Packages:
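If you would rather pin an explicit release than take whatever dnf resolves, the version column from the listing above can be fed back into the install. The following is only a sketch: newest_pkg_version is a hypothetical helper name, and it assumes a sort that supports -V (GNU coreutils does).

```shell
# newest_pkg_version is a hypothetical helper: read the output of
# `dnf list <pkg> --showduplicates` on stdin and print the newest
# version-release string for that package, with the epoch removed.
newest_pkg_version() {
    pkg="$1"
    awk -v pkg="$pkg" '
        # Keep only rows whose first column is "<pkg>.<arch>".
        index($1, pkg ".") == 1 {
            ver = $2
            sub(/^[0-9]+:/, "", ver)   # drop the epoch prefix, e.g. "3:"
            print ver
        }' | sort -V | tail -n 1
}
```

For example, ‘sudo dnf list docker-ce --showduplicates | newest_pkg_version docker-ce’ prints a string that can be appended to the package name, in the same docker-ce-VERSION form that the Rancher script further down passes to yum.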

firewalld is already disabled on this host, so no firewall changes are needed to get DNS resolution working inside Docker containers.

Add my user to the docker group (this takes effect at the next login) and start/enable the docker daemon.

$ sudo usermod -aG docker ec2-user
$ sudo systemctl enable --now docker
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.

$ systemctl is-active docker
active
$ systemctl is-enabled docker
enabled
[ec2-user@ip-192-169-2-20 ~]$ cat /etc/redhat-release && docker --version
Red Hat Enterprise Linux release 8.4 (Ootpa)
Docker version 20.10.7, build f0df350

Test docker with hello-world.
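A minimal sketch of that check as a reusable function, in case you want to script it across several hosts. It assumes the hello-world image can be pulled; docker_smoke_test and the DOCKER_BIN override are hypothetical names introduced here, not part of Docker.

```shell
# docker_smoke_test is a hypothetical helper: run the hello-world image and
# report whether Docker answered with its usual greeting.
# DOCKER_BIN is an assumed override so the check can be pointed at another
# binary (or a stub); it defaults to "docker".
docker_smoke_test() {
    docker_bin="${DOCKER_BIN:-docker}"
    if output=$("$docker_bin" run --rm hello-world 2>&1) &&
       printf '%s\n' "$output" | grep -q 'Hello from Docker!'; then
        echo "docker: OK"
    else
        echo "docker: FAILED" >&2
        return 1
    fi
}
```

Calling ‘docker_smoke_test’ after logging back in also confirms that the docker group membership has taken effect, since the run fails if the user cannot reach the daemon socket.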

Automated Install with Ansible

As I have a number of servers to repeat the installation on, I’ll use an Ansible playbook.

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ ansible --version
ansible 2.9.10
  config file = None
  configured module search path = ['/home/ec2-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.6.8 (default, Mar 18 2021, 08:58:41) [GCC 8.4.1 20200928 (Red Hat 8.4.1-1)]

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ cat /etc/hosts | grep Rancher  | awk '{print $2}' > inv
[ec2-user@ip-192-169-2-108 ansible-rhel8]$ vi inv
[ec2-user@ip-192-169-2-108 ansible-rhel8]$ cat inv
[rancher]
DevRHEL8-Rancher-01
DevRHEL8-Rancher-02
DevRHEL8-Rancher-03
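The grep, awk and manual edit above could be collapsed into one step. This is a sketch only: build_inventory is a hypothetical helper, and unlike the grep above it matches the pattern against the hostname column rather than the whole line.

```shell
# build_inventory is a hypothetical helper: print an INI-style Ansible
# inventory with one group, built from the second column of a hosts file.
#   $1 = hosts file, $2 = group name, $3 = pattern to match hostnames
build_inventory() {
    hosts_file="$1"
    group="$2"
    pattern="$3"
    printf '[%s]\n' "$group"
    # Skip comment lines, keep rows whose hostname matches the pattern.
    awk -v pat="$pattern" \
        '$0 !~ /^[[:space:]]*#/ && $2 ~ pat { print $2 }' "$hosts_file"
}
```

With that defined, ‘build_inventory /etc/hosts rancher Rancher > inv’ would write the group header and hosts in one go.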

In the playbook I’m also taking care of some Rancher prerequisites and other tasks:

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ cat docker-rancher/tasks/main.yaml
---

- name: Upgrade all packages
  dnf:
    name: "*"
    state: latest
  tags: [update_packages]

- name: Install packages
  dnf:
    name:
      - psacct
      - git
      - yum-utils
      - device-mapper-persistent-data
      - lvm2
      - vim
    state: present
  tags: [dnf_installs]

- name: Enable docker-ce repo
  shell: dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
  tags: [docker_repo]

- name: Install docker
  dnf:
    name: docker-ce
    state: present
  tags: [docker_install]

- name: enable docker service
  systemd:
    name: docker
    state: restarted
    enabled: yes
    daemon_reload: yes
  tags: [docker_restart]

- name: Update sshd_config AllowAgentForwarding
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#AllowAgentForwarding yes'
    line: 'AllowAgentForwarding yes'
  tags: [rancher-prereq]

- name: Update sshd_config AllowTcpForwarding
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#AllowTcpForwarding yes'
    line: 'AllowTcpForwarding yes'
  tags: [rancher-prereq]

- name: Update sshd_config GatewayPorts
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#GatewayPorts no'
    line: 'GatewayPorts yes'
  tags: [rancher-prereq]

- name: check bridge networking is allowed
  shell: modprobe br_netfilter
  tags: [bridge]

- name: check bridge networking is allowed bridge-nf-call-iptables
  shell: echo "1" > /proc/sys/net/bridge/bridge-nf-call-iptables
  tags: [bridge]

- name: Add Kubernetes repo
  yum_repository:
    name: kubernetes
    description: Kubernetes repo
    file: kubernetes
    baseurl: https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled: yes
    gpgcheck: 1
    repo_gpgcheck: 1
    gpgkey: https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
  tags: [k8srepo]

- name: Install packages
  dnf:
    name:
      - kubectl
    state: present
  tags: [kubectl_install]

- name: Update /etc/hosts
  copy:
    src: /etc/hosts
    dest: /etc/hosts
    mode: '0644'
  tags: [hosts_file]
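
One thing to be aware of: the two bridge tasks load br_netfilter and write to /proc at run time, so the settings are lost on reboot. A minimal sketch of how they could be persisted, assuming the standard /etc/modules-load.d and /etc/sysctl.d drop-in directories; the helper name and the two file names are hypothetical choices, not something the playbook uses.

```shell
# persist_bridge_settings is a hypothetical helper. The optional argument is
# a root prefix: leave it empty to write under /etc (needs root), or pass a
# scratch directory for a dry run.
persist_bridge_settings() {
    root="${1:-}"
    mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"
    # Ask systemd-modules-load to load br_netfilter at boot.
    echo 'br_netfilter' > "$root/etc/modules-load.d/br_netfilter.conf"
    # Same value the playbook echoes into
    # /proc/sys/net/bridge/bridge-nf-call-iptables.
    echo 'net.bridge.bridge-nf-call-iptables = 1' \
        > "$root/etc/sysctl.d/90-bridge-nf.conf"
}
```

After running it as root, ‘sysctl --system’ reloads the drop-in files without a reboot.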

Running the playbook:

[ec2-user@ip-192-169-2-108 ansible-rhel8]$ ansible-playbook -i ./inv docker-rancher.yaml -b

PLAY [rancher] ********************************************************************************************************************************

TASK [Gathering Facts] ************************************************************************************************************************
ok: [DevRHEL8-Rancher-03]
ok: [DevRHEL8-Rancher-02]
ok: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Upgrade all packages] **************************************************************************************************
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Install packages] ******************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Enable docker-ce repo] *************************************************************************************************
[WARNING]: Consider using the dnf module rather than running 'dnf'.  If you need to use command because dnf is insufficient you can add 'warn:
false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message.
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Install docker] ********************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-02]

TASK [docker-rancher : enable docker service] *************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Update sshd_config AllowAgentForwarding] *******************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-02]

TASK [docker-rancher : Update sshd_config AllowTcpForwarding] *********************************************************************************
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]

TASK [docker-rancher : Update sshd_config GatewayPorts] ***************************************************************************************
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : check bridge networking is allowed] ************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : check bridge networking is allowed bridge-nf-call-iptables] ************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Add Kubernetes repo] ***************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Install packages] ******************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-02]
changed: [DevRHEL8-Rancher-01]

TASK [docker-rancher : Update /etc/hosts] *****************************************************************************************************
changed: [DevRHEL8-Rancher-03]
changed: [DevRHEL8-Rancher-01]
changed: [DevRHEL8-Rancher-02]

PLAY RECAP ************************************************************************************************************************************
DevRHEL8-Rancher-01        : ok=14   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
DevRHEL8-Rancher-02        : ok=14   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
DevRHEL8-Rancher-03        : ok=14   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Test as before with ‘docker run hello-world’ and verify docker version:

[ec2-user@ip-192-169-2-7 ~]$ cat /etc/redhat-release && docker --version
Red Hat Enterprise Linux release 8.4 (Ootpa)
Docker version 20.10.7, build f0df350

Scripted Install

Rancher provide a handy install script available at https://releases.rancher.com/install-docker/20.10.sh

[ec2-user@ip-192-169-2-250 ~]$ curl  https://releases.rancher.com/install-docker/20.10.sh | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17683  100 17683    0     0  46904      0 --:--:-- --:--:-- --:--:-- 46904
# Executing docker install script, commit: 7cae5f8b0decc17d6571f9f52eb840fbc13b2737
+ sudo -E sh -c 'yum install -y -q yum-utils'
+ sudo -E sh -c 'yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
+ '[' stable '!=' stable ']'
+ '[' rhel = rhel ']'
+ adjust_repo_releasever 8.2
+ DOWNLOAD_URL=https://download.docker.com
+ case $1 in
+ releasever=8
+ for channel in "stable" "test" "nightly"
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-stable.baseurl=https://download.docker.com/linux/centos/8/\$basearch/stable --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-stable-debuginfo.baseurl=https://download.docker.com/linux/centos/8/debug-\$basearch/stable --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-stable-source.baseurl=https://download.docker.com/linux/centos/8/source/stable --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ for channel in "stable" "test" "nightly"
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-test.baseurl=https://download.docker.com/linux/centos/8/\$basearch/test --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-test-debuginfo.baseurl=https://download.docker.com/linux/centos/8/debug-\$basearch/test --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-test-source.baseurl=https://download.docker.com/linux/centos/8/source/test --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ for channel in "stable" "test" "nightly"
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-nightly.baseurl=https://download.docker.com/linux/centos/8/\$basearch/nightly --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-nightly-debuginfo.baseurl=https://download.docker.com/linux/centos/8/debug-\$basearch/nightly --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ sudo -E sh -c 'yum-config-manager --setopt=docker-ce-nightly-source.baseurl=https://download.docker.com/linux/centos/8/source/nightly --save'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
+ [[ 8.2 =~ 7\. ]]
+ '[' 8.2 == 7 ']'
+ sudo -E sh -c 'yum makecache'
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Docker CE Stable - x86_64                                                                                      137 kB/s |  14 kB     00:00
Red Hat Update Infrastructure 3 Client Configuration Server 8                                                   35 kB/s | 2.1 kB     00:00
Red Hat Enterprise Linux 8 for x86_64 - AppStream from RHUI (RPMs)                                              22 kB/s | 2.8 kB     00:00
Red Hat Enterprise Linux 8 for x86_64 - BaseOS from RHUI (RPMs)                                                 24 kB/s | 2.4 kB     00:00
Metadata cache created.
INFO: Searching repository for VERSION '20.10.7'
INFO: yum list --showduplicates 'docker-ce' | grep '20.10.7.*el' | tail -1 | awk '{print $2}'
+ '[' -n 20.10.7-3.el8 ']'
+ sudo -E sh -c 'yum install -y -q docker-ce-cli-20.10.7-3.el8'
warning: /var/cache/dnf/docker-ce-stable-fa9dc42ab4cec2f4/packages/docker-ce-cli-20.10.7-3.el8.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 621e9f35: NOKEY
Importing GPG key 0x621E9F35:
 Userid     : "Docker Release (CE rpm) <docker@docker.com>"
 Fingerprint: 060A 61C5 1B55 8A7F 742B 77AA C52F EB6B 621E 9F35
 From       : https://download.docker.com/linux/centos/gpg

Installed:
  docker-ce-cli-1:20.10.7-3.el8.x86_64                                  docker-scan-plugin-0.8.0-3.el8.x86_64

+ sudo -E sh -c 'yum install -y -q docker-ce-20.10.7-3.el8'


Installed:
  container-selinux-2:2.162.0-1.module+el8.4.0+11311+9da8acfb.noarch        containerd.io-1.4.6-3.1.el8.x86_64
  docker-ce-3:20.10.7-3.el8.x86_64                                          docker-ce-rootless-extras-20.10.7-3.el8.x86_64
  fuse-common-3.2.1-12.el8.x86_64                                           fuse-overlayfs-1.4.0-3.module+el8.4.0+11311+9da8acfb.x86_64
  fuse3-3.2.1-12.el8.x86_64                                                 fuse3-libs-3.2.1-12.el8.x86_64
  iptables-1.8.4-10.el8.x86_64                                              libcgroup-0.41-19.el8.x86_64
  libnetfilter_conntrack-1.0.6-5.el8.x86_64                                 libnfnetlink-1.0.1-13.el8.x86_64
  libnftnl-1.1.5-4.el8.x86_64                                               libslirp-4.3.1-1.module+el8.4.0+11311+9da8acfb.x86_64
  policycoreutils-python-utils-2.9-9.el8.noarch                             slirp4netns-1.1.8-1.module+el8.4.0+11311+9da8acfb.x86_64

+ '[' -n 1 ']'
+ sudo -E sh -c 'yum install -y -q docker-ce-rootless-extras-20.10.7-3.el8'
+ command_exists iptables
+ command -v iptables
+ start_docker
+ '[' '!' -z ']'
+ '[' -d /run/systemd/system ']'
+ sudo -E sh -c 'systemctl start docker'
+ sudo -E sh -c 'docker version'
Client: Docker Engine - Community
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        f0df350
 Built:             Wed Jun  2 11:56:24 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.7
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       b0f5bc3
  Built:            Wed Jun  2 11:54:48 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.6
  GitCommit:        d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version:          1.0.0-rc95
  GitCommit:        b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

================================================================================

To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:

    dockerd-rootless-setuptool.sh install

Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.


To run the Docker daemon as a fully privileged service, but granting non-root
users access, refer to https://docs.docker.com/go/daemon-access/

WARNING: Access to the remote API on a privileged Docker daemon is equivalent
         to root access on the host. Refer to the 'Docker daemon attack surface'
         documentation for details: https://docs.docker.com/go/attack-surface/

================================================================================

[ec2-user@ip-192-169-2-250 ~]$



Verify as before:

[ec2-user@ip-192-169-2-250 ~]$ cat /etc/redhat-release && docker --version
Red Hat Enterprise Linux release 8.4 (Ootpa)
Docker version 20.10.7, build f0df350

I’ll use this script in later versions of the ansible playbook.
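
Piping curl straight into sh is convenient, but it runs whatever the server returns without a chance to read it. A hedged alternative is to save the script first, record its checksum and review it before executing. fetch_for_review below is a hypothetical helper and assumes curl and sha256sum are installed.

```shell
# fetch_for_review is a hypothetical helper: download a script to a file
# instead of piping it into sh, then print its SHA-256 for the record.
#   $1 = URL, $2 = destination file
fetch_for_review() {
    url="$1"
    dest="$2"
    # -f: fail on HTTP errors, -sS: quiet but still show errors,
    # -L: follow redirects
    curl -fsSL "$url" -o "$dest" || return 1
    sha256sum "$dest"
}
```

For example, ‘fetch_for_review https://releases.rancher.com/install-docker/20.10.sh install-docker.sh’, then read the file and run it with ‘sh install-docker.sh’ once you are happy with it.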

Wrapping Up

I didn’t expect the process of installing Docker on RHEL 8 to be so easy. I expected to hit dependency issues, but it seems that with the later Docker 20.10 releases many of the install issues are fixed (see https://medium.com/nttlabs/docker-20-10-59cc4bd59d37).

It is still not an ideal situation: Red Hat are unlikely to help with container-related issues if we open a support case while running Docker rather than Podman on RHEL 8. Docker does appear to be stable, though.

Categories
Cloud

Upgrade PHP from 7.2 to 7.4 on Amazon Linux 2

In response to a WordPress warning about the version of PHP, I decided I should upgrade.

My WordPress site runs on an EC2 instance, as described in a previous blog post. The steps to update are described below:

STEP 1: ssh to the EC2 instance with PuTTY or similar (I’m using WSL – Windows Subsystem for Linux to connect from an Ubuntu shell).

STEP 2: Update the system. Note that this does not update PHP:

$ sudo yum update

STEP 3: Check the version of PHP and make sure the amazon-linux-extras package is installed:

$ php -v
$ which amazon-linux-extras

STEP 4: Verify the PHP7.x topic is available:

$ sudo amazon-linux-extras | grep php

STEP 5: Disable both the php7.2 and lamp-mariadb10.2-php7.2 topics:

$ sudo amazon-linux-extras disable php7.2
$ sudo amazon-linux-extras disable lamp-mariadb10.2-php7.2

If you see the warning “Beware that disabling topics is not supported after they are installed.” it can be safely ignored.

STEP 6: Check the status of available topics:

$ sudo amazon-linux-extras | grep php

STEP 7: Enable the php7.4 topic.

$ sudo amazon-linux-extras enable php7.4

STEP 8: Check that the php7.4 topic is now enabled:

$ sudo amazon-linux-extras | grep php

STEP 9: Update to PHP 7.4. First clean up the metadata and then install php along with any dependencies:

$ sudo yum clean metadata
$ sudo yum install php 

STEP 10: Check the version of PHP

$ php -v

STEP 11: Reboot the instance or just restart apache with:

$ sudo systemctl restart httpd

STEP 12: Test the WordPress site and verify

The steps above were sufficient for my setup but may be different in other environments.
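
The version checks in steps 3 and 10 can be scripted if you want the upgrade to fail loudly when PHP is still too old. This is a minimal sketch: php_at_least is a hypothetical helper, it assumes the first line of ‘php -v’ starts with "PHP" followed by the version number, and it relies on a sort that supports -V.

```shell
# php_at_least is a hypothetical helper: read `php -v` output on stdin and
# succeed only if the reported version is at least the given minimum.
#   $1 = minimum version, e.g. 7.4
php_at_least() {
    min="$1"
    # The first line is expected to look like "PHP <version> (cli) ...".
    found=$(awk 'NR == 1 && $1 == "PHP" { print $2 }')
    [ -n "$found" ] || return 1
    # sort -V orders version strings; if the minimum sorts first (or the two
    # are equal), the installed version is new enough.
    [ "$(printf '%s\n%s\n' "$min" "$found" | sort -V | head -n 1)" = "$min" ]
}
```

Used as ‘php -v | php_at_least 7.4 && sudo systemctl restart httpd’, Apache is only restarted once the new version is actually in place.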