An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation

When working with EKS on AWS, you may at some point have tried to run a pod under a certain IAM role and encountered the following error:

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity

What’s frustrating is that, by default, AWS doesn’t give you much feedback on why the error happened.

So I’ve written down some debugging steps for future reference.

First, check that the service account is set up correctly by running an ephemeral pod:

kubectl run -ti --restart=Never --rm debug --image=amazon/aws-cli --overrides='{ "spec": { "serviceAccount": "terraform-runner" } }' -- sts get-caller-identity

This is what I got in response:

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
pod "debug" deleted
pod runners/debug terminated (Error)

This indicates that the link between the service account and the IAM role is not working as it should.
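When the link is broken, it can help to see which identity the pod actually presents to STS. Below is a minimal sketch, not something from my original debugging session, that decodes the claims of a service account token without verifying its signature. It assumes the token is a standard three-segment JWT; the function name decode_jwt_claims is my own.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims from a JWT payload WITHOUT verifying the signature.

    Debugging aid only: never trust unverified claims for authorization.
    """
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are unpadded URL-safe base64; restore padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Inside a pod, the token would typically be read from the file named by the AWS_WEB_IDENTITY_TOKEN_FILE environment variable. The "sub" claim should then read system:serviceaccount:<namespace>:<service account name>, which is the value the role's trust policy has to allow.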

The next step was to check the service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    arn:aws:iam::00000000:role/terraform-runner
    |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{"":"arn:aws:iam::00000000:role/terraform-runner"},"labels":{"":"runners"},"name":"terraform-runner","namespace":"runners"}}
  creationTimestamp: "2021-09-17T11:18:13Z"
  labels:
    runners
  name: terraform-runner
  namespace: runners
  resourceVersion: "34794913"
  uid: d84ceb7f-4fc7-41fa-af97-9b38c7510a94
secrets:
- name: terraform-runner-token-gtm4x

We can see that the service account is correctly configured: its annotations carry the ARN of the IAM role it should assume.

Let’s check the IAM role with the AWS CLI:

aws iam get-role --role-name terraform-runner

{
    "Role": {
        "Path": "/",
        "RoleName": "terraform-runner",
        "RoleId": "AROATWMTMIX62NAAAJADH",
        "Arn": "arn:aws:iam::000000:role/terraform-runner",
        "CreateDate": "2021-09-17T09:26:19+00:00",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "",
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::000000:oidc-provider/"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringEquals": {
                            "": "system:serviceaccount:runners:ecr-publisher"
                        }
                    }
                }
            ]
        },
        "MaxSessionDuration": 3600,
        "RoleLastUsed": {}
    }
}

And there lies the problem: the trust policy above allows system:serviceaccount:runners:ecr-publisher to assume this role, while our service account is called terraform-runner (a copy/paste error in my Terraform stack).
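This kind of mismatch is easy to miss by eye, so here is a rough sketch of how the comparison could be automated. It takes the AssumeRolePolicyDocument as a dict and checks whether a given service account is among the allowed subjects. I am assuming the condition key ends in ":sub", which is the usual shape for EKS OIDC trust policies; the function names are my own, and only exact StringEquals matches are considered.

```python
def expected_subject(namespace: str, name: str) -> str:
    """Build the subject string Kubernetes uses for a service account."""
    return f"system:serviceaccount:{namespace}:{name}"


def allowed_subjects(trust_policy: dict) -> set:
    """Collect subjects allowed to call sts:AssumeRoleWithWebIdentity.

    Assumption: the subject condition key ends with ":sub".
    Only StringEquals conditions are inspected (no wildcard handling).
    """
    subjects = set()
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRoleWithWebIdentity" not in actions:
            continue
        conditions = stmt.get("Condition", {}).get("StringEquals", {})
        for key, value in conditions.items():
            if not key.endswith(":sub"):
                continue
            # A condition value may be a single string or a list of strings.
            subjects.update([value] if isinstance(value, str) else value)
    return subjects


def trust_allows(trust_policy: dict, namespace: str, name: str) -> bool:
    """True if the policy explicitly allows the given service account."""
    return expected_subject(namespace, name) in allowed_subjects(trust_policy)
```

With the policy shown above, trust_allows(policy, "runners", "terraform-runner") would come back False, since the only allowed subject is the ecr-publisher service account.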

Once the trust policy has been fixed, the pod can assume the role correctly:

kubectl run -ti --restart=Never --rm debug --image=amazon/aws-cli --overrides='{ "spec": { "serviceAccount": "terraform-runner" } }' -- sts get-caller-identity
If you don't see a command prompt, try pressing enter.
{
    "UserId": "AROATWMTM0000N6HAJADH:botocore-session-11111",
    "Account": "000000",
    "Arn": "arn:aws:sts::000000:assumed-role/terraform-runner/botocore-session-1111"
}
pod "debug" deleted

Note: if the command above works but your pod still doesn’t, recreate the pod. Sometimes changes to the service account are not picked up until the pod using it is restarted.
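As far as I understand it, the role settings are injected into a pod when it is created, in the form of the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables, which would explain why an already running pod does not see later changes. A small sketch to check for them from inside a container, assuming Python is available there (the helper name is my own):

```python
import os

# Environment variables the AWS SDKs read for web identity credentials.
IRSA_ENV_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_env(env=None) -> list:
    """Return the expected variables that are unset or empty.

    `env` defaults to the current process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    return [name for name in IRSA_ENV_VARS if not env.get(name)]
```

If missing_irsa_env() returns a non-empty list inside the pod, the pod was most likely created before the service account was set up, and recreating it is the first thing to try.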

