When working with EKS on AWS, you may at some point have tried to run a pod under a specific IAM role and hit the following error:
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
What’s frustrating is that, by default, AWS gives you very little feedback about why the error happened.
So I’ve written down some debugging steps for future reference.
First, check that the service account is set up correctly by running an ephemeral pod:
kubectl run -ti --restart=Never --rm debug --image=amazon/aws-cli --overrides='{ "spec": { "serviceAccount": "terraform-runner" } }' -- sts get-caller-identity
This is what I got in response:
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
pod "debug" deleted
pod runners/debug terminated (Error)
This indicates that the link between the service account and the IAM role is not working as it should.
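To see which identity the pod actually presents to STS, it can help to decode the web identity token it sends. Below is a minimal Python sketch, assuming the token is the JWT that EKS mounts into the pod at the path held in the AWS_WEB_IDENTITY_TOKEN_FILE environment variable. The helper name decode_jwt_claims is mine, and the function only decodes the claims; it does not verify the signature.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


# Hypothetical usage inside a pod that has Python available:
#   import os
#   with open(os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]) as fh:
#       print(decode_jwt_claims(fh.read())["sub"])
```

For a service account named terraform-runner in the runners namespace, the sub claim should read system:serviceaccount:runners:terraform-runner. That is the value the role's trust policy has to allow.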
The next step was to check the service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::00000000:role/terraform-runner
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{"eks.amazonaws.com/role-arn":"arn:aws:iam::00000000:role/terraform-runner"},"labels":{"argocd.argoproj.io/instance":"runners"},"name":"terraform-runner","namespace":"runners"}}
  creationTimestamp: "2021-09-17T11:18:13Z"
  labels:
    argocd.argoproj.io/instance: runners
  name: terraform-runner
  namespace: runners
  resourceVersion: "34794913"
  uid: d84ceb7f-4fc7-41fa-af97-9b38c7510a94
secrets:
- name: terraform-runner-token-gtm4x
We can see that the service account is correctly configured: it carries the eks.amazonaws.com/role-arn annotation pointing at the terraform-runner role.
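If you have many service accounts to check, the annotation lookup is easy to script. A small Python sketch, assuming the manifest was exported as JSON (for example with kubectl get serviceaccount terraform-runner -n runners -o json) and loaded into a dict. The function name role_arn_annotation is hypothetical.

```python
from typing import Optional

ROLE_ARN_ANNOTATION = "eks.amazonaws.com/role-arn"


def role_arn_annotation(service_account: dict) -> Optional[str]:
    """Return the IAM role ARN annotated on a ServiceAccount, or None if absent."""
    metadata = service_account.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    return annotations.get(ROLE_ARN_ANNOTATION)
```

A return value of None means the annotation is missing entirely, which would be a different problem from the one described in this post.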
Let’s check the IAM role with the AWS CLI:
aws iam get-role --role-name terraform-runner
{
  "Role": {
    "Path": "/",
    "RoleName": "terraform-runner",
    "RoleId": "AROATWMTMIX62NAAAJADH",
    "Arn": "arn:aws:iam::000000:role/terraform-runner",
    "CreateDate": "2021-09-17T09:26:19+00:00",
    "AssumeRolePolicyDocument": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "Federated": "arn:aws:iam::000000:oidc-provider/oidc.eks.eu-west-2.amazonaws.com/id/C2208D308EF0087F1B87E86008A8FAC2"
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
            "StringEquals": {
              "oidc.eks.eu-west-2.amazonaws.com/id/C2208D308EF0087F1B87E86500A8FAC2:sub": "system:serviceaccount:runners:ecr-publisher"
            }
          }
        }
      ]
    },
    "MaxSessionDuration": 3600,
    "RoleLastUsed": {}
  }
}
And there lies the problem: the trust policy above only lets system:serviceaccount:runners:ecr-publisher assume this role, while our service account is called terraform-runner (a copy/paste error in my Terraform stack).
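This comparison is easy to miss by eye, so here is a rough Python sketch of it. It assumes the trust policy is the AssumeRolePolicyDocument object from the aws iam get-role output, loaded into a dict. The function names are hypothetical, and the check is a plain string comparison: it does not evaluate wildcards in StringLike conditions or any other IAM condition logic.

```python
from typing import List


def expected_subject(namespace: str, service_account: str) -> str:
    """Build the sub value Kubernetes uses for a service account."""
    return f"system:serviceaccount:{namespace}:{service_account}"


def allowed_subjects(trust_policy: dict) -> List[str]:
    """Collect the ':sub' condition values of Allow statements for web identity."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    subjects: List[str] = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRoleWithWebIdentity" not in actions:
            continue
        for operator in ("StringEquals", "StringLike"):
            conditions = stmt.get("Condition", {}).get(operator, {})
            for key, value in conditions.items():
                if key.endswith(":sub"):
                    subjects.extend([value] if isinstance(value, str) else value)
    return subjects
```

With the policy above, allowed_subjects returns only the ecr-publisher subject, so expected_subject("runners", "terraform-runner") is not in the list and the mismatch shows up immediately.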
Once the error has been fixed, we can assume the role correctly:
kubectl run -ti --restart=Never --rm debug --image=amazon/aws-cli --overrides='{ "spec": { "serviceAccount": "terraform-runner" } }' -- sts get-caller-identity

If you don't see a command prompt, try pressing enter.
{
  "UserId": "AROATWMTM0000N6HAJADH:botocore-session-11111",
  "Account": "000000",
  "Arn": "arn:aws:sts::000000:assumed-role/terraform-runner/botocore-session-1111"
}
pod "debug" deleted
Note: if the command above works but your pod still doesn’t, recreate the pod. Sometimes changes to the service account are not picked up until the pod using it is restarted.
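One likely reason for this is that the role settings are injected into a pod when it is created: the EKS pod identity webhook adds the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables at admission time, so a pod created before the service account was annotated never receives them. The Python sketch below checks for those variables, assuming the pod spec was exported with kubectl get pod <name> -o json and loaded into a dict. The function name missing_irsa_env is hypothetical.

```python
from typing import Dict, List

IRSA_ENV_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_env(pod: dict) -> Dict[str, List[str]]:
    """Map each container name to the injected variables it lacks."""
    missing: Dict[str, List[str]] = {}
    for container in pod.get("spec", {}).get("containers", []):
        present = {env.get("name") for env in container.get("env") or []}
        absent = [name for name in IRSA_ENV_VARS if name not in present]
        if absent:
            missing[container.get("name", "")] = absent
    return missing
```

An empty result means every container has both variables. A non-empty result for a pod that predates the service account change is a strong hint that the pod needs to be recreated.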