While it’s fairly “trivial” to install a stacked Kubernetes cluster with kubeadm on any cloud provider or on managed bare metal (where you have a certain degree of control over the networking, which permits you to use BGP, for example), it’s not so trivial when your nodes are located in different network segments (clouds) and/or behind NAT.
With this guide I will try to alleviate the pain related to this kind of setup.
Premise
In this scenario, we have 3 control plane nodes situated in different cloud providers, and the following applies:
- Every node is NOT aware of its public IP (it has a private IP address and 1:1 NAT with an external IP unknown to the node).
- Their private IPs are NOT routable between them, and potentially overlap (i.e. multiple clouds use the 10.0.0.0/24 range).
- We have access to a load balancer (I used HAProxy), which will be used for the control plane endpoint. In case you don’t have one, you can use round-robin DNS, but it’s not advised.
- I am assuming we are using Ubuntu as the underlying OS. In case you are using CentOS or similar, you will have to adapt some sections.
We will also have some worker nodes distributed across our clouds; everything above applies to them as well.
Preparations
On every node, perform the following actions (hopefully you are deploying your infrastructure with Terraform and can use a user-data script):
#!/bin/bash

# Find suitable version for kubeadm from here: https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages
export KUBE_VERSION=1.22.5-00
export DEBIAN_FRONTEND=noninteractive

# some random dns resolution issue
until curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
do
  echo "retrying download of k8s key"
  sleep 5
done
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | tee /etc/apt/sources.list.d/kubernetes.list
set -e

echo "Backing up and deleting ip tables"
mkdir -p /root/iptables-backup/
mv /etc/iptables/rules.* /root/iptables-backup/
iptables-save > /root/iptables-rules
iptables --flush

echo "Creating netplan"
export PUBLIC_IP=$(curl -s checkip.amazonaws.com)
cat << EOF > /etc/netplan/60-floating-ip.yaml
network:
  version: 2
  renderer: networkd
  bridges:
    dummy0:
      dhcp4: no
      dhcp6: no
      accept-ra: no
      interfaces: [ ]
      addresses:
        - ${PUBLIC_IP}/32
EOF

echo "Installing Kubeadm"
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sysctl --system

echo "Running apt update"
until apt-get update
do
  echo "error during apt-get update"
done
apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
apt-get install -y kubelet=${KUBE_VERSION} kubeadm=${KUBE_VERSION} kubectl=${KUBE_VERSION}
apt-mark hold kubelet kubeadm kubectl

echo "Installing Docker"
mkdir -p /etc/docker/
cat <<EOF | tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF

echo "Installing docker"
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null

apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io
apt-get dist-upgrade -y

netplan apply
I will explain the script below.
Initial setup
# Find suitable version for kubeadm from here: https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages
export KUBE_VERSION=1.22.5-00
export DEBIAN_FRONTEND=noninteractive
Define which version of Kubernetes should be installed; DEBIAN_FRONTEND=noninteractive ensures that tasks like apt-get upgrade will not be blocked waiting for human intervention.
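If you are unsure which exact version string to use, you can also query the repository once it has been added in the next step (a quick check, not part of the script):
# List the kubeadm package versions available in the configured apt repository
apt-cache madison kubeadm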
Install kubernetes repo key
# some random dns resolution issue
until curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
do
echo "retrying download of k8s key"
sleep 5
done
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | tee /etc/apt/sources.list.d/kubernetes.list
set -e
This block has two purposes:
- Ensure that there is internet connectivity (I had some issues where the user-data script was fired before the network interface was up).
- Add the Kubernetes apt repository and the related GPG key (we will be using it a bit later).
Clean up existing iptables
echo "Backing up and deleting ip tables"
mkdir -p /root/iptables-backup/
mv /etc/iptables/rules.* /root/iptables-backup/
iptables-save > /root/iptables-rules
iptables --flush
If you are on Oracle Cloud, you will need to remove the pre-installed iptables rules, since they would potentially interfere with the rules created by Kubernetes (to be fair, I still don’t understand the reasoning behind their inclusion, since you own the underlying VPC and networking rules).
We also save a backup of the existing rules to /root/iptables-rules.
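If you ever need the old rules back, the dump we saved can simply be re-applied (a minimal sketch based on the backup path used above):
# Restore the firewall rules that were saved before the flush
iptables-restore < /root/iptables-rules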
Public IP
echo "Creating netplan"
export PUBLIC_IP=$(curl -s checkip.amazonaws.com)
cat << EOF > /etc/netplan/60-floating-ip.yaml
network:
  version: 2
  renderer: networkd
  bridges:
    dummy0:
      dhcp4: no
      dhcp6: no
      accept-ra: no
      interfaces: [ ]
      addresses:
        - ${PUBLIC_IP}/32
EOF
netplan apply
As I said before, our nodes are not aware of their public IP; as a consequence, by default, the kubelet will use the internal IP, and this will prevent nodes in other network segments from reaching each other.
Kubernetes nodes can also have an external IP field, but unfortunately the only way to set it is through a cloud provider integration. I thought about writing a “fake” cloud provider for this purpose, but (at least for now) I’ve abandoned that idea.
So instead, we are going to find our public IP (remember, in our scenario we have 1:1 NAT) with PUBLIC_IP=$(curl -s checkip.amazonaws.com), and then we are going to create a dummy interface with the address ${PUBLIC_IP}/32.
We will be using this dummy interface later on to make the kubelet advertise the public IP as the node IP.
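Once netplan apply has run at the end of the script, you can sanity-check that the dummy interface really carries the public address (an optional check, not part of the original script):
# The address on dummy0 should match what checkip.amazonaws.com reports
ip -4 addr show dev dummy0
curl -s checkip.amazonaws.com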
Kubelet network requirements
Straight from the official documentation
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
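You can verify that the module is loaded and the sysctls are active with something like:
# br_netfilter must be loaded and both bridge sysctls must report 1
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables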
Install kubeadm
Install and hold kubeadm and related components
until apt-get update
do
echo "error during apt-get update"
done
apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
apt-get install -y kubelet=${KUBE_VERSION} kubeadm=${KUBE_VERSION} kubectl=${KUBE_VERSION}
apt-mark hold kubelet kubeadm kubectl
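A quick way to confirm the packages are installed at the pinned version and held:
# All three packages should show up as held
apt-mark showhold
kubeadm version -o short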
Docker options
mkdir -p /etc/docker/
cat <<EOF | tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF
We are going to use Docker as the container runtime. In order to do so, we need to ensure that Docker’s cgroup driver matches the one used by the kubelet (systemd).
While we are at it, let’s also cap the container log size to avoid filling the hard drive with logs.
Install Docker
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io
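Once Docker is up, it is worth double-checking that it picked up the systemd cgroup driver from the daemon.json we wrote earlier (an optional check):
# Should print "systemd"; "cgroupfs" means daemon.json was not applied
docker info --format '{{.CgroupDriver}}'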
Final touches
apt-get dist-upgrade -y
Upgrade all packages (headers included), and reboot the machine.
Init first node
Cloud preparations
Control plane endpoint
We need a common endpoint for our HA setup to work. The best way to achieve this is to have a load balancer listening on tcp/6443.
If your load balancer supports HTTPS probes, point them at /livez; otherwise a simple TCP probe will do.
If you don’t have any load balancer available, you can simply point a DNS A record to your first node for now.
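For reference, this is roughly what the load balancer part could look like with HAProxy; the backend name and the 203.0.113.x addresses below are placeholders for your control plane nodes’ public IPs, so adapt them to your environment (a sketch, not a hardened configuration):
# Append a TCP frontend/backend for the Kubernetes API server (placeholder addresses)
cat <<'EOF' >> /etc/haproxy/haproxy.cfg
frontend k8s-api
    bind *:6443
    mode tcp
    default_backend k8s-api-nodes

backend k8s-api-nodes
    mode tcp
    option httpchk GET /livez
    http-check expect status 200
    default-server inter 5s fall 3 rise 2
    server master-01 203.0.113.11:6443 check check-ssl verify none
    server master-02 203.0.113.12:6443 check check-ssl verify none
    server master-03 203.0.113.13:6443 check check-ssl verify none
EOF
systemctl reload haproxy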
Ports
For our setup we will need the following ports (a firewall sketch follows the list):
- tcp/6443: Kubernetes API server (the control plane port)
- tcp/2379-2380: etcd client and peer communication
- tcp/10250: kubelet API
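If you manage the host firewall with ufw (or equivalent security group rules), opening them looks roughly like this; note that your CNI will need its own ports on top of these:
# Open the control plane ports listed above (restrict source ranges where possible)
ufw allow 6443/tcp
ufw allow 2379:2380/tcp
ufw allow 10250/tcp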
Starting first control plane node
kubeadm init
Normally, if you had both a private and a public interface on your box, you could use the --apiserver-advertise-address flag, and everything would work as it should.
etcd uses the value of that flag as its listening address, and sadly, if you are using a provider such as Oracle Cloud, Scaleway, Online.net, etc., even though your box is reachable on the PUBLIC_IP, a service that binds to PUBLIC_IP:PORT won’t be reachable from outside the node, because internally the provider routes traffic to your private IP.
As a consequence, we need to pass a configuration file to kubeadm to force etcd to bind to all interfaces while advertising the public one.
export ENDPOINT=k8s.endpoint.dev:6443
export PUBLIC_IP=$(curl -s checkip.amazonaws.com)
export KUBECONFIG=/etc/kubernetes/admin.conf

cat <<EOF > /etc/default/kubelet
KUBELET_EXTRA_ARGS="--node-ip=${PUBLIC_IP}"
EOF

cat <<EOF > /root/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: ${PUBLIC_IP}
  bindPort: 6443
---
apiVersion: "kubeadm.k8s.io/v1beta3"
kind: ClusterConfiguration
networking:
  podSubnet: "10.244.0.0/16"
controlPlaneEndpoint: ${ENDPOINT}
etcd:
  local:
    extraArgs:
      listen-peer-urls: https://0.0.0.0:2380
      listen-client-urls: https://0.0.0.0:2379
      listen-metrics-urls: http://0.0.0.0:2381
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
EOF

kubeadm init --config=/root/kubeadm-config.yaml --upload-certs --v=5
If all went well, you should receive a message similar to this
Your Kubernetes control-plane has initialized successfully!
...
  kubeadm join k8s.endpoint.dev:6443 --token hlpq3v.82***abrnw0z \
    --discovery-token-ca-cert-hash sha256:4a0ff99bb3059bc6c38f5b1c227805aa344e1ec7c424f870c9d175b50801b1c9 \
    --control-plane --certificate-key 48e47af7fa0e22b7f1e8d53***0af08e50f613734be7f4cba731733f3b83
...
Keep a note of the values above; we will need them shortly.
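If you come back to this later, keep in mind that the bootstrap token expires after 24 hours and the uploaded certificates after 2 hours; both can be regenerated from the first control plane node with standard kubeadm commands:
# Print a fresh join command with a new bootstrap token
kubeadm token create --print-join-command
# Re-upload the control plane certificates and print a new certificate key
kubeadm init phase upload-certs --upload-certs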
Verify that everything is working fine (the CoreDNS pods will be Pending because our node is not Ready yet; we will get there):
# kubectl get nodes -o wide
NAME            STATUS     ROLES                  AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k8s-master-01   NotReady   control-plane,master   66s   v1.22.2   172.24.246.28   <none>        Ubuntu 20.04.3 LTS   5.11.0-1019-oracle   docker://20.10.8
CNI
At this point, if you view your node, it will be marked as NotReady due to the missing CNI plugin.
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
If everything went well, you should see something like this:
# kubectl get nodes -o wide
NAME            STATUS   ROLES                  AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k8s-master-01   Ready    control-plane,master   2m5s   v1.22.2   172.24.246.28   <none>        Ubuntu 20.04.3 LTS   5.11.0-1019-oracle   docker://20.10.8
Other control planes
export ENDPOINT=k8s.endpoint.dev:6443
export PUBLIC_IP=$(curl -s checkip.amazonaws.com)
export TOKEN=hlpq3v.82***abrnw0z
export CA_CERT_HASH=sha256:4a0ff99bb3059bc6c38f5b1c227805aa344e1ec7c424f870c9d175b50801b1c9
export CERT_KEY=48e47af7fa0e22b7f1e8d53***0af08e50f613734be7f4cba731733f3b83
export KUBECONFIG=/etc/kubernetes/admin.conf

cat <<EOF > /etc/default/kubelet
KUBELET_EXTRA_ARGS="--node-ip=${PUBLIC_IP}"
EOF

cat <<EOF > /root/kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
caCertPath: /etc/kubernetes/pki/ca.crt
discovery:
  bootstrapToken:
    apiServerEndpoint: ${ENDPOINT}
    token: ${TOKEN}
    caCertHashes:
      - ${CA_CERT_HASH}
    unsafeSkipCAVerification: false
controlPlane:
  certificateKey: ${CERT_KEY}
  localAPIEndpoint:
    advertiseAddress: ${PUBLIC_IP}
    bindPort: 6443
---
apiVersion: "kubeadm.k8s.io/v1beta3"
kind: ClusterConfiguration
controlPlaneEndpoint: ${ENDPOINT}
etcd:
  local:
    extraArgs:
      listen-peer-urls: https://0.0.0.0:2380
      listen-client-urls: https://0.0.0.0:2379
      listen-metrics-urls: http://0.0.0.0:2381
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
EOF

kubeadm join --config=/root/kubeadm-config.yaml --v=5
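Note that the KubeletConfiguration above sets serverTLSBootstrap: true, which means the kubelet serving certificates are requested through CSRs that are not approved automatically; after the node joins, approve them from any box with admin access (the CSR name below is a placeholder):
# List pending kubelet serving certificate requests and approve them one by one
kubectl get csr
kubectl certificate approve <csr-name>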
Worker nodes
If you are joining a new worker node a while after the cluster was created, you will need to obtain a new token from a control plane node by running kubeadm token create --print-join-command.
export PUBLIC_IP=$(curl -s checkip.amazonaws.com)

cat <<EOF > /etc/default/kubelet
KUBELET_EXTRA_ARGS="--node-ip=${PUBLIC_IP}"
EOF

kubeadm join k8s.endpoint.dev:6443 --token hlpq3v.82***abrnw0z --discovery-token-ca-cert-hash sha256:4a0ff99bb3059bc6c38f5b1c227805aa344e1ec7c424f870c9d175b50801b1c9
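After the join finishes, check from a control plane node that the worker shows up and eventually becomes Ready:
# The new worker should appear in the list and turn Ready once its CNI pod is running
kubectl get nodes -o wide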
Remove taint
This step is optional: in my setup I have a limited number of nodes, so I do want to schedule workloads on my control plane nodes.
You can skip this; just remember to add tolerations to anything you deploy from this step onwards in the guide.
# kubectl taint node --all node-role.kubernetes.io/master:NoSchedule-
node/k8s-master-01 untainted
node/k8s-master-02 untainted
node/k8s-master-03 untainted
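On newer Kubernetes releases the control plane nodes are tainted with node-role.kubernetes.io/control-plane instead of (or in addition to) the master taint, so you may need to remove that one as well:
# Remove the control-plane taint too (ignore the error if the taint is not present)
kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule-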