Sometimes its useful to be able to run some ephemeral containers on kubernetes cluster in order to perform some debugging (i.e. dns resolution, pinging nodes, etc).

Sadly, most of the times, it can be tricky to remember all overrides so here is a small list

Run an ephemeral shell on a random node

1kubectl run -ti...

Kubernetes networking

Pause container

In Kubernetes, the pause container serves as the “parent container” for all of the containers in your pod, and it has two main responsibilities:

  • it serves as the basis of Linux namespace sharing in the pod
  • with PID (process ID) namespace sharing enabled, it serves as...

While it’s fairly “trivial” to install a stacked kubernetes cluster with kubeadm on any cloud provider or managed bare metal (where you have a certain degree of management over the networking which permits you to use bgp for example), it’s not so trivial when your nodes are situated in different network segments (clouds) and/or behind NAT.

With this guide I will try to alleviate a pain related to this kind of setup.

When working with EKS under AWS, it’s possible that at some point you wanted to run a pod under a certain role, and you’ve encountered a following error:

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity

What’s frustrating, is that by default AWS doesn’t provide you a lot of feedback of why that error happened.

So I’ve written down some debug steps for further reference:

One of my etcd nodes in my home k8s cluster has been failing with following message:

12021-01-14 11:16:09.233458 I | embed: listening for peers on 192.168.0.33:2380
2raft2021/01/14 11:16:09 tocommit(29492601) is out of range [lastIndex(29492469)]. Was the raft log corrupted, truncated, or lost?
3panic: tocommit(29492601) is out of range [lastIndex(29492469)]. Was the raft log corrupted, truncated, or lost?

These are steps I took to fix it: