When working with EKS on AWS, you may at some point have tried to run a pod under a specific IAM role and hit the following error:
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
Frustratingly, AWS by default gives you very little feedback about why the error happened, so I’ve written down some debugging steps for future reference.
First, check that the service account is set up correctly by running an ephemeral pod:
kubectl run -ti --restart=Never --rm debug --image=amazon/aws-cli --overrides='{ "spec": { "serviceAccount": "terraform-runner" } }' -- sts get-caller-identity
This is what I got in response:
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
pod "debug" deleted
pod runners/debug terminated (Error)
This indicates that the link between the service account and the IAM role is not working as it should.
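To see which identity the pod actually presents to STS, it can help to decode the projected service account token. The sketch below assumes the token is a standard JWT, which on EKS is normally mounted at the path given by the AWS_WEB_IDENTITY_TOKEN_FILE environment variable. The function name is my own, and the code only decodes the payload without verifying the signature, so treat it as a debugging aid only.

```python
import base64
import json


def decode_jwt_claims(token):
    """Return the (unverified) claims from the payload segment of a JWT."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Run against a copy of the token, the "sub" claim should read system:serviceaccount:runners:terraform-runner for the setup in this post, and the "iss" claim should point at the cluster's OIDC provider.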
The next step was to check the service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::00000000:role/terraform-runner
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{"eks.amazonaws.com/role-arn":"arn:aws:iam::00000000:role/terraform-runner"},"labels":{"argocd.argoproj.io/instance":"runners"},"name":"terraform-runner","namespace":"runners"}}
  creationTimestamp: "2021-09-17T11:18:13Z"
  labels:
    argocd.argoproj.io/instance: runners
  name: terraform-runner
  namespace: runners
  resourceVersion: "34794913"
  uid: d84ceb7f-4fc7-41fa-af97-9b38c7510a94
secrets:
- name: terraform-runner-token-gtm4x
We can see that the service account is correctly configured: the eks.amazonaws.com/role-arn annotation is present and points at the terraform-runner role.
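If you have several service accounts to check, the same information can be pulled out programmatically. This is a minimal sketch, assuming you feed it the output of kubectl get serviceaccount terraform-runner -o json; the function name and the returned keys are my own invention. It extracts the annotated role ARN, the role name to look up in IAM, and the subject string that the role's trust policy is expected to allow.

```python
import json

ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"


def describe_service_account(manifest_json):
    """Summarise the IAM-related parts of a ServiceAccount manifest (JSON)."""
    metadata = json.loads(manifest_json)["metadata"]
    role_arn = metadata.get("annotations", {}).get(ROLE_ANNOTATION)
    return {
        "role_arn": role_arn,
        # The role name is the last path segment of the ARN.
        "role_name": role_arn.rsplit("/", 1)[-1] if role_arn else None,
        # Subject that the projected token carries for this service account.
        "subject": "system:serviceaccount:{}:{}".format(
            metadata["namespace"], metadata["name"]
        ),
    }
```

A role_arn of None means the annotation is missing, which would be the first thing to fix.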
Let’s check the IAM role with the AWS CLI:
aws iam get-role --role-name terraform-runner
{
    "Role": {
        "Path": "/",
        "RoleName": "terraform-runner",
        "RoleId": "AROATWMTMIX62NAAAJADH",
        "Arn": "arn:aws:iam::000000:role/terraform-runner",
        "CreateDate": "2021-09-17T09:26:19+00:00",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "",
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::000000:oidc-provider/oidc.eks.eu-west-2.amazonaws.com/id/C2208D308EF0087F1B87E86008A8FAC2"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringEquals": {
                            "oidc.eks.eu-west-2.amazonaws.com/id/C2208D308EF0087F1B87E86500A8FAC2:sub": "system:serviceaccount:runners:ecr-publisher"
                        }
                    }
                }
            ]
        },
        "MaxSessionDuration": 3600,
        "RoleLastUsed": {}
    }
}
And there lies the problem: the trust policy above only allows system:serviceaccount:runners:ecr-publisher to assume this role, while our service account is called terraform-runner (a copy/paste error in my Terraform stack).
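Spotting this kind of mismatch by eye is easy to get wrong, so here is a small sketch that automates the comparison. It assumes the input is the JSON printed by aws iam get-role, and the function names are hypothetical. It does an exact string comparison only, so wildcard patterns in a StringLike condition are not evaluated, and it does not check the OIDC provider in the Principal.

```python
import json


def allowed_subjects(get_role_output):
    """List the ':sub' values a role's trust policy accepts for web identity."""
    role = json.loads(get_role_output)["Role"]
    statements = role["AssumeRolePolicyDocument"]["Statement"]
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    subjects = []
    for statement in statements:
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRoleWithWebIdentity" not in actions:
            continue
        conditions = statement.get("Condition", {})
        for operator in ("StringEquals", "StringLike"):
            for key, value in conditions.get(operator, {}).items():
                if key.endswith(":sub"):
                    subjects.extend([value] if isinstance(value, str) else value)
    return subjects


def trust_policy_allows(get_role_output, namespace, service_account):
    """True if the exact service account subject appears in the trust policy."""
    expected = "system:serviceaccount:{}:{}".format(namespace, service_account)
    return expected in allowed_subjects(get_role_output)
```

For the role shown above, trust_policy_allows(output, "runners", "terraform-runner") comes back False, which is exactly the bug.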
Once the error has been fixed, we can assume the role correctly:
kubectl run -ti --restart=Never --rm debug --image=amazon/aws-cli --overrides='{ "spec": { "serviceAccount": "terraform-runner" }  }' -- sts get-caller-identity

If you don't see a command prompt, try pressing enter.
{
    "UserId": "AROATWMTM0000N6HAJADH:botocore-session-11111",
    "Account": "000000",
    "Arn": "arn:aws:sts::000000:assumed-role/terraform-runner/botocore-session-1111"
}
pod "debug" deleted
Note: if the command above works but your pod still fails, recreate the pod. Changes to a service account are sometimes not picked up until the pod using it is restarted.
